The minimum spanning tree (MST) is a combinatorial optimization problem: given a connected
graph with a real weight ("cost") on each edge, find the spanning tree that minimizes the sum of
the costs of the occupied edges. We consider the random MST, in which the edge costs are
(quenched) independent random variables. There is a strongly-disordered spin-glass model due to
Newman and Stein [Phys. Rev. Lett. 72, 2286 (1994)], which maps precisely onto the random MST.
We study scaling properties of random MSTs using a relation between Kruskal's greedy algorithm
for finding the MST, and bond percolation. We solve the random MST problem on the Bethe
lattice (BL) with appropriate wired boundary conditions and calculate the fractal dimension $D = 6$
of the connected components. Viewed as a mean-field theory, the result implies that on a lattice
in Euclidean space of dimension $d$, there are of order $W^{d-D}$ large connected components of the
random MST inside a window of size $W$, and that $d = d_c = D = 6$ is a critical dimension. This
differs from the value 8 suggested by Newman and Stein. We also critique the original argument
for 8, and provide an improved scaling argument that again yields $d_c = 6$. The result implies that
the strongly-disordered spin-glass model has many ground states for $d > 6$, and only of order one
below six. The results for MSTs also apply on the Poisson-weighted infinite tree, which is a
mean-field approach to the continuum model of MSTs in Euclidean space, and is a limit of the BL. In a
companion paper we develop an $\epsilon = 6 - d$ expansion for the random MST on critical percolation
clusters.

I. INTRODUCTION

A. Motivation and approach 

The minimum spanning tree (MST) problem is one 
of the oldest and best-studied problems of combinatorial 
optimization [l|, [2, H, 0, Q and has found application to 
the physics of random systems [1, 0, B 9 • To define the 
problem, we consider an undirected, connected graph G 
with vertex set $V$, edge set $E$, and a real-valued cost $\ell_e$
assigned to each edge $e \in E$. A spanning tree is then
defined as a subset of the edges of G that connects all 
the vertices and contains no cycles: in other words, it is 
a tree and it spans V. Such a tree must exist because the 
graph is assumed connected. A minimum spanning tree 
T is a spanning tree such that the sum of the costs of its 
edges,

\begin{equation}
\ell(T) = \sum_{e \in T} \ell_e, \tag{I.1}
\end{equation}

is minimized over the set of all spanning trees on G. If the
costs $\ell_e$ are strictly positive, then any spanning subset of
the edges that has minimum cost is automatically a tree. 

If we view the cost (I.1) as an energy, then this is a
problem of finding the ground state of a classical system
in which the configurations are spanning trees. If the
costs of the edges are taken to be random variables, then
this becomes a classical system with quenched disorder: 
the minimum must be found for a fixed set of costs, be- 
fore any averaging over realizations of the edge costs is 
performed. 

In the present paper and its companion p^ . we be- 
gin a program to develop an analytical theory of the 
statistical geometry of random MSTs on a lattice $\mathbb{Z}^d$
in d-dimensional Euclidean space, where the edge costs 
are independently and identically distributed (iid) with a 
continuous probability distribution. (For continuous dis- 
tributions of iid edge costs, the MST on a finite graph 
is unique with probability one.) We will primarily be 
interested in this model's long-range scaling properties 
(to be defined in the following section) because these are 
predicted to be the same for all models in the same
universality class: in a quantitative sense, changing
microscopic details of the model such as the type of lattice will
not affect these properties (see e.g. Ref. [11] for further
discussion). We will argue below that, due to universality,
our results also apply to related problems, such
as the continuum model [5], [12]. In this model, the
vertices are points in Euclidean space $\mathbb{R}^d$ which are Poisson
distributed with uniform density, so the graph is the in- 
finite complete graph, and the cost assigned to each edge 
is the Euclidean distance between its endpoints. Results 
concerning the expectation value of the cost $\ell(T)$ were
previously given in Ref. [13]. 

It is simple to solve computationally an instance of 
the MST problem, and many efficient algorithms exist
d, [H, [l^ m, [l3. However, we are interested in the
statistical properties of the random MST, which is relevant
to physical disordered systems. MSTs play a role
in transport in disordered networks [H, [H, [13, HH, [13] , 
for example in current flow in random resistor networks 
(Tsl [23I [H, [1^ . There exist in the literature several nu- 
merical simulations [i, [3, [11,113, [Hi determining Dp, the 
fractal dimension of the optimal path. 

Furthermore, Newman and Stein (hereafter referred to 
as NS) have shown [1,[^ that in the strong disorder limit, 
the problem of finding the ground state of an Ising spin 
glass may be directly mapped on to the MST problem. 
Some of the results presented here bear directly on the 
discussion by these authors, which relate to fundamen- 
tal questions about the ground state structure of spin 
glasses, and, by extension, solutions to hard optimization 
problems H, [H, [3l| . Related considerations also arose 
in the quantum spin glass (or random transverse field 
Ising model) [HI . These connections will be discussed at 
length below. 

Our program for MSTs takes a form similar to those for 
critical phenomena in, for example, Ising spin systems, 
or percolation. It proceeds in stages. The first stage is 
a mean-field theory, which we develop in the present pa- 
per by introducing and solving a version of random MST 
on the Bethe lattice (BL) (or Cayley tree), with suitable 
boundary conditions. The BL model can be solved ex- 
actly in some cases, or exactly for asymptotic, universal 
properties in more cases. We also investigate the related 
Poisson-weighted infinite tree (PWIT) model [1^,^3, 34],
which can be obtained as a limit of the BL as the coordi- 
nation number goes to infinity and which can be viewed 
as the mean-field theory of the continuum model defined 
above. The subsequent stages would be to develop a full 
statistical field theory (or at least, a perturbation ex- 
pansion) for the geometric properties in Euclidean space 
(either for the lattice or continuum models). Then a 
renormalization group approach should first show that 
corrections to the mean-field theory do not change the 
universal asymptotic properties above some critical dimension
$d_c$, and identify $d_c$. A third stage would be
calculation of universal properties at $d < d_c$ in an epsilon
expansion in powers of $\epsilon = d_c - d$. A fourth stage would
be to show that the epsilon expansion is Borel summable, 
so that it defines the true results. The second and later 
stages will not be completed in this paper. 

In developing an approach to the MST problem using 
Kruskal's greedy algorithm [Ti^ , which is related to bond 
percolation, we will be led to consider the process of find- 
ing the minimum spanning forest (MSF) on the clusters 
of bond percolation as a function of the probability p that 
an edge is occupied. (A spanning forest is a spanning, 
vertex-disjoint collection of trees.) We call this process 
MSF(p) (note that MSF(1) is the same as the MST).
We will argue that the universal properties of MSF(p) at 
any p > pc, where pc is the percolation threshold of the 
model, are the same. At this time we have been able to 
develop the perturbation techniques as mentioned above 
only for MSF(p) with $p < p_c$. The expansion at $p > p_c$
is more difficult. However, it has been argued that some
properties for $p > p_c$ are the same as those for $p = p_c$
pol [H, [13, [11]. In the absence of a perturbation theory
treatment, it is not clear if this is true, but certainly we
find many indications on the BL that the vicinity of $p_c$
dominates the behavior in the region $p > p_c$. In any case,
in the companion (to be referred to as II) to the present 
paper we develop first a small-p expansion that is ex- 
act for any p on a finite graph, and then a perturbation 
expansion using modified Feynman diagram techniques 
valid for $p < p_c$. By using renormalization-group techniques,
this yields an epsilon expansion for the fractal
dimension of a path on MSF($p_c$).



B. Outline and discussion of main results 

We now describe the main results of this work. The 
BL is an infinite tree with fixed coordination number (de- 
gree) at each vertex. As it is itself a tree, the MST on 
such a graph would be simply the whole graph, so atten- 
tion to the definition of boundary conditions in the spirit 
ofii is essential in producing nontrivial behavior. The 
boundary conditions can be introduced first on a finite 
version of the BL, and then the infinite size limit can be 
taken. Specifically, as discussed in detail in Sec. III, we
adopt a wired boundary condition, which has the result 
that instead of the MST, the minimum object is a span- 
ning forest. (This MSF produced by the wired boundary 
condition should not be confused with that in the process 
MSF(p) mentioned above; in that language, at present we 
are considering the minimum objects at $p = 1$.) We are
interested in the statistical geometry of this non-trivial 
random forest. As the size of the lattice goes to infinity, 
the statistical properties of the MSF have a well-defined 
limit. We call the number of vertices that are connected 
to the central site and lie within $m$ steps on the BL the
"mass" $M(m)$ within $m$ steps. We can then calculate
(among other things) its expectation value $\overline{M}(m)$, and
we find that it scales as

\begin{equation}
\overline{M}(m) \sim m^{3} \tag{I.2}
\end{equation}

as $m \to \infty$. The same result holds for the PWIT.

Employing this result as a mean-field theory, the stan- 
dard method (see, e.g., [8, 9, 35]) for transferring results 
from the BL to a Euclidean lattice entails that distance 
$m$ on the BL corresponds to distance squared, $m \sim R^2$,
on the Euclidean lattice. We find that the expected mass
within distance $R$ of the origin scales as

\begin{equation}
\overline{M}(R) \sim R^{6} \tag{I.3}
\end{equation}

so that the tree has fractal dimension $D = 6$. As the trees
fill the lattice, this means that the expected number of 
connected components that intersect a ball of radius R, 
denoted $N(R)$, scales as

\begin{equation}
N(R) \sim R^{\theta} \sim R^{d-6} \tag{I.4}
\end{equation}

as $R \to \infty$, so the tree "proliferation exponent" [H, [s^l
$\theta = d - D = d - 6$. Note that on the BL, $N(m)$ increases
exponentially with $m$, but there is a power-law correction
factor of $m^{-3}$ which produces the behavior relevant for
the Euclidean lattice.
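
The chain of scalings behind the proliferation exponent can be collected in one place. The lines below only restate the steps quoted in the text, writing $\theta$ for the proliferation exponent and assuming (as the text does) that BL distance $m$ corresponds to $R^2$ and that the trees fill the lattice, so that the $\sim R^d$ sites in a ball are shared among its large components. It is a heuristic summary, not an independent derivation.

```latex
\begin{align*}
\overline{M}(m) \sim m^{3}, \quad m \sim R^{2}
  &\;\Longrightarrow\; \overline{M}(R) \sim R^{6}, \qquad D = 6, \\
N(R) \sim \frac{R^{d}}{\overline{M}(R)}
  &\;\sim\; R^{d-6}, \qquad \theta = d - D = d - 6 .
\end{align*}
```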

Two points should be explained here. One is that 
$N(R)$ may be subject to boundary effects at the surface
of the ball, so that the number of components intersecting
the ball is larger, maybe $R^{d-1}$. Our result is expected
to describe the number of large components intersecting
the ball, say those whose intersection with the ball is of
linear size $R/2$.

The second, very important, point is that the MST on 
a finite connected portion of a Euclidean lattice is again 
a connected object by definition (for conventional "free" 
boundary conditions). But as this portion becomes larger 
and approaches the whole lattice, the path between any 
two vertices on the tree may make larger and larger ex- 
cursions, so that in the limit, viewed locally on any fi- 
nite length scale R, the tree appears to be a forest of 
many connected components [3,1^ . The use of the wired 
boundary condition is intended to simulate this possi- 
ble effect, by producing such a forest in a finite system, 
though the properties near the wired boundary may dif- 
fer from those near the boundary of a ball of radius R 
inside a system of size much larger than R. 

The proliferation exponent $\theta$ cannot become negative,
so even if the mean field theory is indeed valid in high
dimensions, it must break down in sufficiently low dimensions.
These notions parallel some in other problems of
random fractal clusters, such as in percolation at threshold
(the critical point). A non-zero proliferation exponent
$\theta$ is the geometric counterpart to the violation of
hyperscaling relations in critical phenomena at $d > d_c$
[39 . \3T\; hyperscaling is obeyed when $\theta = 0$. In critical
percolation, the clusters at threshold are non-space-filling,
and their fractal dimension is $D = 4$ for $d > d_c$,
while $d_c = 6$. The MST, on the other hand, is an example
of the subclass of such problems in which the union
of the clusters fills space, so that $D + \theta = d$. This then
suggests that $d_c = D = 6$ is the upper critical dimension,
below which exponents (such as $D$) must change
with $d$. Below $d = 6$ it is plausible that there is only a
single connected component (in the local sense described
above) with probability one, or at least that, as $R \to \infty$,
$N(R) < R^{a}$ for any $a > 0$. Then $D = d$ and $\theta = 0$, and

\begin{equation}
D = \begin{cases} 6 & (d > 6), \\ d & (d < 6), \end{cases} \tag{I.5}
\end{equation}

\begin{equation}
\theta = \begin{cases} d - 6 & (d > 6), \\ 0 & (d < 6). \end{cases} \tag{I.6}
\end{equation}

This result contrasts with that of NS, who suggested
that $d_c = 8$ for MSTs 0, Q, in the sense that $\theta > 0$ for
$d > 8$, $\theta = 0$ for $d < 8$. NS appear to have believed that
the connected components of the MST have dimension
4, but their arguments would actually imply that $D = 8$
also; their arguments will be discussed in more depth in
Sec. II. (In fact, they showed that $d_c < 8$.) The question
of the number of connected components was raised
in a construction of a minimum spanning forest in the 
continuum model directly in infinite volume by Aldous 
and Steele 0, who suggested that the forest has a 
single connected component in all dimensions d. It has 
been proved that there is a single connected component 
for this model in d = 2 [s^ , but for larger d there has not 
so far been agreement even at a heuristic level. We note 
that for a different model of random trees on a lattice, 
that of uniform spanning trees (each spanning tree on a 
finite graph is given equal probability), similar behavior 
of the dimensions was proven, except that $d_c^{\rm UST} = 4$ and
$D_{\rm UST} = 4$ for $d > 4$ [39]. Hence the universal properties
of MSTs and uniform spanning trees are distinct, at least 
in sufficiently high dimensions [i^ |4l| . 

In the companion paper II, we formulate a perturba- 
tion expansion for the geometry of the MSF(p) process 
in Euclidean space for $p < p_c$. This is based on an exact
small-p expansion. It is used to calculate the fractal
dimension of a path on the MST on a critical percola- 
tion cluster. Here we will mention only that our calcu- 
lations can in principle be extended to other exponents, 
such as those defined in [i^, or carried to higher orders 
in e. They can also be extended to include statistical 
properties that involve the cost of the MSF($p_c$). There
do not appear to be any scaling relations that relate the 
geometric exponents for MSTs or MSF(pc) to those for 
percolation, unlike those found for the costs in , even 
though the critical dimension dc = 6 is the same for all of 
them. In Ref. it was stated that there are different 
critical dimensions for properties involving the costs and 
for those involving only the geometry (6 and 8, respec- 
tively). This remark was based on NS's results for the 
geometry, and is superseded by the present paper. 



C. Structure of the paper 

The remainder of the paper parallels the discussion 
of the preceding section. In Section II we discuss the
general properties of MSTs that we exploit in our calcu- 
lations, and explain the relation to percolation. We in- 
troduce Prim's and Kruskal's algorithms for computing 
the MST; these are related to invasion percolation and 
ordinary bond percolation, respectively. Similarly, the 
continuum model of MST is related to continuum ver- 
sions of these. The strongly-disordered spin-glass model 
of NS is defined, and we provide a critique of their re- 
sults for MSTs. We also give a scaling argument that 
suggests that the correct answer is dc = 6, D = 6, which 
is confirmed in the following section. In Section III we
solve the MST problem statistically on the BL using the 
connection between Kruskal's algorithm and percolation; 
this defines a mean field theory. We also discuss the limit 
that gives MSTs on PWITs. Implications of our results 
for the strongly-disordered spin glass model are discussed 
in the Conclusion.

II. MINIMUM SPANNING TREES,
PERCOLATION, AND 
STRONGLY-DISORDERED SPIN GLASSES 

In this Section, we describe basic properties and tech- 
niques for solving a MST problem. There are two main 
simple algorithms to consider, Prim's and Kruskal's. 
Both are "greedy" in nature, and when the edge costs 
are assumed to be iid random variables with a continuous 
distribution, both are related [i^ to models of percolation.
Prim's algorithm, which seems to be more popular
in the physics literature on MSTs, is connected with in- 
vasion percolation. Kruskal's algorithm (also sometimes 
referred to simply as the greedy algorithm) is related to 
bond percolation. As the latter is much more tractable 
from the statistical mechanics point of view, this is the 
approach that we will use. Here we review these con- 
nections and basic properties of percolation. We review 
the strongly-disordered spin glass model of Newman and 
Stein (NS), and critique their argument that the critical 
dimension of the model is $d_c = 8$. In NS, this critical
dimension is that dimension d above which there is an 
infinite number of ground states in the thermodynamic 
limit of the model. We give a scaling argument for $d_c = 6$.



A. Basic properties and algorithms 

A property that is useful in finding the MST of an 
edge-weighted finite graph is the following [2]: Suppose a
spanning forest is given, that is not necessarily minimum 
cost (some connected components of such a forest may 
consist of a single vertex and no edges). If we choose a 
connected component, and then find the edge e of mini- 
mum cost among all those edges with just one end in that 
component, then among all the spanning trees that con- 
tain the given spanning forest (as a subset of the edges 
of the tree), the one of minimum cost contains the edge 
e. Thus starting from any given forest, one can greed- 
ily add edges of minimum cost (among those that leave 
any one connected component at each stage) and arrive 
at the spanning tree that has minimum cost among all 
those containing the given forest. In particular, starting 
with the forest in which each tree is a single vertex and 
no edges, one can find the MST. This still leaves many 
ways to proceed in selecting the connected component at 
each step. Two particular ways of doing so are of interest 
here: a) the Kruskal or greedy algorithm in which 
at each step one adds the cheapest edge not yet occu- 
pied that does not create a cycle when added to the set 
of occupied edges; b) the Jarnik-Prim-Dijkstra algorithm 
(usually referred to as Prim's), in which one 
starts by selecting a vertex, and then adds the cheap- 
est edge leaving the connected component containing the 
initial vertex at each step. In either case, the algorithm 
stops when a spanning tree is formed, that is when $|V| - 1$
edges are occupied. From the preceding result, both of
these ultimately produce the MST. (We have neglected
to specify how to break any ties that arise from edges of
equal cost, which can lead to non-unique MSTs, as these
are not of interest in this paper.) The difference between
the two in terms of the geometry of the connected components
or "clusters" is that in Kruskal's algorithm, at a
typical stage there are several trees containing more than
one vertex, whereas in Prim's there is always only one.
Both procedures have similar running times, and can also
be improved.
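
The two greedy procedures just described can be sketched in a few lines. This is a minimal illustration rather than either of the improved algorithms mentioned above: the function names (`kruskal`, `prim`, `random_complete_graph`) are ours, edges are assumed to be tuples `(cost, u, v)` with `u < v`, and the costs are drawn iid and uniform so that ties have probability zero.

```python
import heapq
import random


def kruskal(n, edges):
    """Scan edges by increasing cost; accept an edge unless it closes a cycle."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    tree = set()
    for cost, u, v in sorted(edges):
        ru, rv = find(u), find(v)
        if ru != rv:  # endpoints lie in different trees of the current forest
            parent[ru] = rv
            tree.add((cost, u, v))
    return tree


def prim(n, edges, start=0):
    """Grow a single tree from `start`, always adding the cheapest leaving edge."""
    adj = [[] for _ in range(n)]
    for cost, u, v in edges:
        adj[u].append((cost, u, v))  # stored as (cost, from, to)
        adj[v].append((cost, v, u))
    in_tree = [False] * n
    in_tree[start] = True
    heap = list(adj[start])
    heapq.heapify(heap)
    tree = set()
    while heap and len(tree) < n - 1:
        cost, u, v = heapq.heappop(heap)
        if in_tree[v]:
            continue  # both ends already on the tree: would form a cycle
        in_tree[v] = True
        tree.add((cost, min(u, v), max(u, v)))
        for edge in adj[v]:
            if not in_tree[edge[2]]:
                heapq.heappush(heap, edge)
    return tree


def random_complete_graph(n, seed):
    """Complete graph on n vertices with iid uniform costs (illustrative choice)."""
    rng = random.Random(seed)
    return [(rng.random(), u, v) for u in range(n) for v in range(u + 1, n)]


edges = random_complete_graph(8, seed=1)
t_k = kruskal(8, edges)
t_p = prim(8, edges)
print(t_k == t_p, len(t_k))
```

Both functions stop once $|V| - 1$ edges are occupied and, for distinct costs, return the same edge set; what differs is only the order in which the edges are accepted.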

The graph and the set of edge costs define the MST
via minimization of the total cost, which is the sum of
the costs of the occupied edges as in Eq. (I.1). From the
above algorithms, it is clear that the precise values of the 
costs are not important in finding the geometry of the 
MST (and hence neither is the probability distribution 
for them [26]). Only the rank ordering of the set of costs
is important. Indeed, even this much information is not 
required, as evidenced by the fact that the most efficient 
known deterministic algorithm for computing the MST 
on an arbitrary graph [43] has significantly faster asymptotic
running time than that for sorting all the edges of
that graph by cost. 
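
The statement that only the rank ordering of the costs matters can be checked directly. The sketch below (our own illustration; `mst_edges` and the choice of the map $c \mapsto e^{5c}$ are arbitrary assumptions) applies a strictly increasing transformation to every cost and compares the resulting trees.

```python
import math
import random


def find(parent, x):
    """Union-find root lookup with path halving."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def mst_edges(n, costs):
    """Kruskal's algorithm; `costs` maps vertex pairs (u, v) to edge costs."""
    parent = list(range(n))
    tree = set()
    for u, v in sorted(costs, key=costs.get):
        ru, rv = find(parent, u), find(parent, v)
        if ru != rv:
            parent[ru] = rv
            tree.add((u, v))
    return tree


rng = random.Random(7)
n = 10
costs = {(u, v): rng.random() for u in range(n) for v in range(u + 1, n)}
# A strictly increasing map changes every cost but not their rank order.
warped = {e: math.exp(5.0 * c) for e, c in costs.items()}

tree_a = mst_edges(n, costs)
tree_b = mst_edges(n, warped)
print(tree_a == tree_b)
```

The total cost of the tree changes under the map, but the set of occupied edges does not, which is why the geometry of the random MST is independent of the cost distribution.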

B. Relation of Kruskal's algorithm with bond 
percolation 

The standard way to define bond (or Bernoulli) perco- 
lation on a graph is to declare that edges are either occu- 
pied with probability p or not occupied, independently. 
We are then interested in the geometric properties of the 
clusters (connected components) formed by the occupied 
edges. For a graph that is a portion of a lattice in Eu- 
clidean space, the basic problem is to find the percolation 
threshold, the value of p above which there exists a path 
on a single cluster from one side of the system to the 
other. When the graph is the complete graph, the perco- 
lation problem becomes the theory of "random graphs" 
(these graphs being the clusters). The threshold is then 
the value of p at which a cluster of size of order $|V|$ occurs,
as $|V| \to \infty$. In both cases, the behavior near the
threshold is a critical phenomenon, and there are criti- 
cal exponents. The complete graph model is one popular 
way of obtaining a mean-field treatment of percolation. 

A random edge-weighted graph gives rise to a sample 
of bond percolation, as follows. We suppose that the 
weights or costs on the edges are iid random variables, 
with a probability density $P(\ell_e)$ of the cost of any one
edge $e$. If we say that all edges with costs less than or
equal to $\ell_0$ are occupied, then this gives a sample of bond
percolation in which $p = p_0$ is given by

\begin{equation}
p_0 = \int_{-\infty}^{\ell_0} d\ell_e \, P(\ell_e). \tag{II.1}
\end{equation}

In particular, if $P(\ell_e) = 1$ for $\ell_e$ in the interval [0, 1] and
zero outside, then $p_0 = \ell_0$.
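
As a further worked instance of this relation (our own example, not one used in the paper), take an exponential density of costs. The occupation probability is then the cumulative distribution evaluated at the cutoff, and the relation can be inverted to give the cost cutoff that corresponds to a given $p_0$:

```latex
\begin{align*}
P(\ell_e) &= e^{-\ell_e} \quad (\ell_e \ge 0), \\
p_0 &= \int_{0}^{\ell_0} d\ell_e \, e^{-\ell_e} = 1 - e^{-\ell_0},
\qquad
\ell_0 = -\ln(1 - p_0).
\end{align*}
```

In this example the percolation threshold $p_0 = p_c$ is reached at the cost $\ell_0 = -\ln(1 - p_c)$; the map between $\ell_0$ and $p_0$ is monotonic, so the ordering of the edges is unchanged.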

Now consider Kruskal's algorithm for such a random 
edge-weighted graph. At each step, one must "test" the
edges not tested earlier, and find the cheapest one that
does not create a cycle when added to those occupied. 
We say that such an edge is accepted. Thus we can view 
the algorithm as working through the edges in order of 
increasing cost. If the algorithm is terminated at cost $\ell_0$
(at which stage there may be only a spanning forest, not 
yet a spanning tree) , then the set of edges already tested 
forms a sample of percolation, as just described. Thus, 
if we had accepted all edges instead of only those that 
did not form a cycle, we would have obtained a sample 
of bond percolation. We can thus view percolation, as 
well as Kruskal's algorithm, as a dynamical process in 
which clusters are grown beginning from the set of ver- 
tices and no occupied edges, and eventually obtaining a 
cluster spanning the graph. Kruskal's algorithm gives
a variant on this percolation process in which as $\ell_0$
increases, an edge is accepted only if adding it does not
form a cycle [44]. The process does not depend on the
probability density chosen for the costs, because as mentioned
above only the ordering is important for the tree.
The process is fully characterized by the variable $p_0$, and
distinct probability densities $P(\ell_e)$ give rise to the same
probability measure on MSTs, as emphasized in Ref. [26].
Without loss of generality, we can consider $\ell_e$ to be
uniformly distributed between 0 and 1, so $p_0 = \ell_0$, as above.
We will use the name MSF($p_0$) for the random spanning
forest produced by halting the Kruskal algorithm at some
value $p_0$; the true MST is obtained at $p_0 = 1$, or by halting
when $|V| - 1$ edges are occupied.
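
The relation between MSF($p_0$) and bond percolation can be made concrete with a short sketch. It assumes a small square lattice with free boundaries and uniform costs on [0, 1]; the helper names (`grid_costs`, `msf`, `components`) are ours. Halting Kruskal's algorithm at $p_0$ gives a forest whose trees span exactly the percolation clusters at occupation probability $p_0$.

```python
import random


def find(parent, x):
    """Union-find root lookup with path halving."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def grid_costs(L, seed):
    """Nearest-neighbour edges of an L x L grid with iid uniform costs."""
    rng = random.Random(seed)
    costs = {}
    for x in range(L):
        for y in range(L):
            i = x * L + y
            if x + 1 < L:
                costs[(i, i + L)] = rng.random()
            if y + 1 < L:
                costs[(i, i + 1)] = rng.random()
    return costs


def msf(n, costs, p0):
    """Kruskal's algorithm halted once the next edge costs more than p0."""
    parent = list(range(n))
    forest = []
    for (u, v), c in sorted(costs.items(), key=lambda kv: kv[1]):
        if c > p0:
            break
        ru, rv = find(parent, u), find(parent, v)
        if ru != rv:  # accept only edges that do not close a cycle
            parent[ru] = rv
            forest.append((u, v))
    return forest


def components(n, edges):
    """Partition of the vertices into clusters connected by `edges`."""
    parent = list(range(n))
    for u, v in edges:
        parent[find(parent, u)] = find(parent, v)
    groups = {}
    for x in range(n):
        groups.setdefault(find(parent, x), set()).add(x)
    return {frozenset(g) for g in groups.values()}


L, p0 = 12, 0.5
n = L * L
costs = grid_costs(L, seed=3)
forest = msf(n, costs, p0)
occupied = [e for e, c in costs.items() if c <= p0]  # bond percolation sample
same_clusters = components(n, forest) == components(n, occupied)
print(same_clusters, len(forest), len(occupied))
```

The forest has fewer edges than the percolation sample (the cycle-closing edges are rejected), but the vertex sets of the clusters coincide, and running to $p_0 = 1$ yields the MST.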

Later in the paper we develop analytical techniques to 
study MSTs using the theory of bond percolation. Here 
we mention some of the basic properties of percolation, 
for the limit of an infinite system [sHi . The percolation
threshold $p_c$ is non-universal; it depends on the lattice or
class of graphs considered. For hypercubic graphs in
dimension $d$ (i.e. $\mathbb{Z}^d$, with edges connecting nearest
neighbors only), $p_c$ lies strictly between 0 and 1 for $d > 1$,
while for $d = 1$, $p_c = 1$. For $p < p_c$, there are finite
clusters only. The typical size $\xi$ of a cluster (the
correlation length) diverges as $\xi \sim (p_c - p)^{-\nu_{\rm perc}}$ as $p \to p_c$
from below. For $p > p_c$ (so assuming $d > 1$), there
is a single infinite cluster with probability one, and for
$p_c < p < 1$ a non-zero density of finite clusters (clusters
not connected to infinity). The typical size of the finite
clusters now defines $\xi$, and $\xi \sim (p - p_c)^{-\nu_{\rm perc}}$ as $p \to p_c$
from above, with the same exponent $\nu_{\rm perc}$. The
correlation length exponent $\nu_{\rm perc}$ equals 1/2 for $d > 6$, but
deviates from this value for $d < 6$. At criticality, $p = p_c$,
there is a power law distribution of cluster sizes up to
infinite size. For $d > 6$ these clusters have fractal
dimension $D_{\rm perc} = 4$, while this dimension $D_{\rm perc}$ deviates from
this value below $d = 6$. For $d > 6$, there are of order
$R^{\theta_{\rm perc}}$ large such clusters intersecting a ball of radius $R$,
where $\theta_{\rm perc} = d - 6$, while $\theta_{\rm perc} = 0$ for $d < 6$. All
the exponents may be calculated by considering suitable
correlation functions. (All power laws may be subject
to logarithmic corrections when $d = 6$, which we do not
consider.)



C. Relation of Prim's algorithm with invasion 
percolation 

The arguments of the previous subsection can be re- 
peated for Prim's algorithm. From Prim's algorithm, one 
obtains a dynamical process in which at each step a single 
edge is added to a growing cluster (tree), which contains
the initial vertex x. The edge added is the cheapest one 
that borders the current tree and does not form a cy- 
cle. In a finite system, this cluster eventually spans all 
the vertices, and at that point becomes the MST. In an 
infinite system, if the "time" in the process is identified 
with the number of edges added, then the tree continues 
to grow indefinitely until an infinite tree is obtained af- 
ter infinite time. We should note that it is not obvious 
that this infinite tree is spanning, and for d > 1 it will 
not be. Because the tree depends on the starting vertex
$x$, say, we denote this tree obtained after infinite time as
$T_\infty(x)$; it is a random object that depends on the costs
of all the edges that were tested in growing the tree. The
latter edges are those on the tree, together with those
that have at least one end connected to $x$ by $T_\infty(x)$.

Prim's algorithm is connected to a type of percolation
called invasion percolation [i^, |4^, \^ analogously
to how Kruskal's is connected to bond percolation; for 
Prim's algorithm this connection has been noted repeat- 
edly in the physics literature d, 0, i, i, [H, [111. Us- 
ing the edge-weighted graph with iid costs as before, a 
model of invasion percolation is obtained by modifying 
Prim's algorithm to accept the cheapest edge that bor- 
ders but is not on the current cluster (i.e. by neglecting 
the no-cycle condition). Notice that in both the Prim 
and invasion process, the costs of the accepted edges do 
not increase monotonically as they did in the case of the 
Kruskal/percolation process, because when the cluster 
first enters a region of space, additional edges become 
available, which may be cheaper than those accepted earlier.
Indeed, in an infinite system, the edges accepted
after long times have (with high probability) costs $< p_c$
of the corresponding bond percolation problem (in the
model with costs uniformly distributed between 0 and 1)
m m m El. It is believed [H EE [13, gl that the
invasion cluster is a fractal very similar to the critical
percolation clusters, with the same universal properties,
so its fractal dimension is $D_{\rm inv} = D_{\rm perc} = 4$ for $d > 6$.
Clearly, the set of vertices on the invasion cluster and on
the invasion tree produced by Prim's algorithm are the
same, so the same fractal dimensions apply to the trees
$T_\infty(x)$.
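
The claim that the invasion cluster and the Prim tree occupy the same vertices can be illustrated on a finite grid. In the sketch below (function and parameter names such as `grow`, `allow_cycles` and `n_vertices` are our own), the two processes differ only in whether an edge joining two vertices already on the cluster is occupied; growth is stopped after a fixed number of vertices has been invaded, as a stand-in for a finite time in the infinite system.

```python
import heapq
import random


def grid_adjacency(L, seed):
    """Adjacency lists of an L x L grid; entries are (cost, from, to)."""
    rng = random.Random(seed)
    adj = [[] for _ in range(L * L)]
    for x in range(L):
        for y in range(L):
            i = x * L + y
            neighbours = ([i + L] if x + 1 < L else []) + ([i + 1] if y + 1 < L else [])
            for j in neighbours:
                c = rng.random()
                adj[i].append((c, i, j))
                adj[j].append((c, j, i))
    return adj


def grow(adj, start, n_vertices, allow_cycles):
    """Prim's process (allow_cycles=False) or invasion percolation (True)."""
    in_cluster = {start}
    order = [start]      # vertices in the order they are invaded
    occupied = set()     # occupied edges, stored as sorted vertex pairs
    heap = list(adj[start])
    heapq.heapify(heap)
    while heap and len(order) < n_vertices:
        cost, u, v = heapq.heappop(heap)
        edge = (min(u, v), max(u, v))
        if edge in occupied:
            continue
        if v in in_cluster:
            if allow_cycles:
                occupied.add(edge)  # invasion ignores the no-cycle condition
            continue
        occupied.add(edge)
        in_cluster.add(v)
        order.append(v)
        for e in adj[v]:
            heapq.heappush(heap, e)
    return order, occupied


adj = grid_adjacency(20, seed=11)
start = 10 * 20 + 10  # a site near the centre of the grid
order_prim, tree = grow(adj, start, 150, allow_cycles=False)
order_inv, cluster = grow(adj, start, 150, allow_cycles=True)
print(order_prim == order_inv, len(tree), len(cluster))
```

The vertices are invaded in the same order in both processes; the invasion cluster simply carries additional, cycle-closing edges on top of the tree.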



D. Strongly-disordered model of a spin glass 

NS [1, @ (see also Ref. [1, 0]) defined a strongly- 
disordered spin glass model and showed that the problem 
of finding the ground state maps onto a MST problem. 
We now briefly describe this model. In the next subsection,
we discuss the arguments of NS, which motivated
many of the questions addressed in this paper.

The Edwards-Anderson (EA) Ising spin glass model 
[2^ is defined by the Hamiltonian 

\begin{equation}
\mathcal{H} = -\sum_{i<j} J_{ij} s_i s_j \tag{II.2}
\end{equation}

for Ising spins $s_i = \pm 1$, where $J_{ij}$ are quenched random
variables. The positions $i$, $j$ are taken as lattice points
in a portion (say, a cube of side $L$) of a $d$-dimensional
hypercubic lattice, and for the edges $\langle ij \rangle$ of the
corresponding graph (i.e. "nearest neighbor bonds"), the $J_{ij}$'s
are iid variables with a distribution independent of the
portion of the lattice chosen (in particular, independent
of the number $|V| = L^d$ of spins), for example a Gaussian
distribution with mean zero and standard deviation
$J_0$. The $J_{ij}$'s are quenched random variables, meaning
that thermodynamic quantities must be calculated with a
fixed sample of $J_{ij}$'s, and then averages (or moments, etc)
taken at the end. The NS strongly-disordered model differs
from the EA model in the distribution of $J_{ij}$'s. While
they are still iid, the width of the distribution is assumed
to be extremely large, and depends on the number of
spins, in a fashion to be specified below. The distribution
is symmetric, so that the sign of $J_{ij}$, $\epsilon_{ij} = \mathrm{sgn}\, J_{ij}$, is
$\pm 1$ with probability 1/2 for either case, independently of
the random magnitude $K_{ij} = |J_{ij}|$. NS focus on ground
state properties, and specify a boundary condition that
the spins on the boundary are an arbitrary set of values
$\pm 1$, chosen independently of the $J_{ij}$'s on the edges
in the interior. Clearly, similar models can be defined
on other graphs, including the complete graph (infinite- 
range model), or with different boundary conditions. 

The central idea in the use of a broad distribution of 
disorder (that is, of the Kij's) is that for such a broad 
distribution, for any subset of edges, there is always a 
single Kij that dominates all others in the set, or even 
dominates the sum of all the others. Thus the width 
of the distribution must simply be chosen large enough 
that this is so, and that is why the width must increase 
with the system size. In this case the problem of finding 
the ground state of the model with given bonds Jij and 
boundary spin values may be solved by a greedy pro- 
cedure. NS chose to use a procedure similar to Prim's 
algorithm. To find the orientation (i.e. the value ±1) in 
the ground state of a given spin located at x, first find 
the largest $K_{ij}$ among the edges leaving $x$. The relative
orientation, $s_i s_j$ for $i$ corresponding to $x$, is clearly then
determined, but not $s_i$ itself. Then find the largest $K_{ij}$
leaving this cluster, that is with one end on the cluster,
but not the other. This fixes the relative orientation
with a further spin $k$, say. This process can be repeated,
adding edges to the cluster until a boundary spin is en- 
countered, at which point all the spins on the cluster are 
now determined. Edges that connect two sites of a clus- 
ter that are already connected need not be considered (or 
accepted), since the relative orientation of those spins is 
already determined. Thus the edges accepted form a tree. 
The process is clearly identical to Prim's algorithm, un- 



til the growing invasion tree touches the boundary. The 
process can then be repeated starting with any spin not 
already fixed, until all spins have been determined. For 
the trees after the first, the process is defined to restart 
from another vertex not already connected if the growing 
tree encounters an earlier tree, as well as if it encounters 
the boundary, since again such an encounter fixes the 
spin orientations on that growing tree. 

The process of repeatedly growing trees until every ver- 
tex is part of a tree that touches the boundary exactly 
once produces a spanning forest on the graph (or portion 
of the lattice). From the percolation point of view, simi- 
lar use of boundary conditions is called a wired boundary 
condition, and has also been used in MST problems [H]- 
We may imagine that the edges that connect the bound- 
ary vertices to one another have costs less than all those 
in the interior, so in Prim's algorithm, once the first tree 
encounters the boundary, it is immediately connected to 
all other boundary sites along the boundary. (The pre- 
cise ordering of the boundary costs among themselves 
is not of interest, because we are not interested in the 
edges of the MST on the boundary.) Then as all earlier 
trees are now viewed as connected in a single tree that 
includes all the boundary vertices, the result when the 
process terminates is a spanning tree. It is in fact the
MST on the given graph with costs -Kij, together with
costs → -∞ for the boundary edges. The process used
by NS is a variant on the Prim process that occasionally
restarts from a different vertex that is not connected
to any others. This is among the many different ways
to find the same MST, as can be seen using the general
fact from Sec. II A. Hence any valid construction of the
MST with this wired boundary condition produces the
same spanning tree, and hence also produces the same
ground state of the NS model. In particular, we can use
the Kruskal algorithm.
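The statement that different greedy constructions give the same tree can be checked directly on a small example. The following is a minimal sketch, not the procedure of NS: it assumes a square grid with i.i.d. uniform edge costs (so all costs are distinct with probability one), and the helper names `kruskal`, `prim` and `random_grid` are purely illustrative.

```python
import heapq
import random


def kruskal(n, edges):
    """Return the MST edge set of a connected graph via Kruskal's algorithm."""
    parent = list(range(n))

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    tree = set()
    for cost, i, j in sorted(edges):  # test edges in order of increasing cost
        ri, rj = find(i), find(j)
        if ri != rj:  # accept only edges that do not close a cycle
            parent[ri] = rj
            tree.add((i, j))
    return tree


def prim(n, edges, start=0):
    """Return the MST edge set by growing a single invasion tree from `start`."""
    adj = {v: [] for v in range(n)}
    for cost, i, j in edges:
        adj[i].append((cost, i, j))
        adj[j].append((cost, j, i))
    in_tree = {start}
    frontier = list(adj[start])  # edges with one end on the tree
    heapq.heapify(frontier)
    tree = set()
    while frontier and len(in_tree) < n:
        cost, a, b = heapq.heappop(frontier)  # cheapest bordering edge
        if b in in_tree:
            continue  # both ends already on the tree: would close a cycle
        in_tree.add(b)
        tree.add((min(a, b), max(a, b)))
        for e in adj[b]:
            heapq.heappush(frontier, e)
    return tree


def random_grid(L, rng):
    """L x L square grid with i.i.d. uniform edge costs, stored as (cost, i, j), i < j."""
    edges = []
    for x in range(L):
        for y in range(L):
            v = x * L + y
            if x + 1 < L:
                edges.append((rng.random(), v, v + L))
            if y + 1 < L:
                edges.append((rng.random(), v, v + 1))
    return L * L, edges
```

With distinct costs the two functions return identical edge sets whatever starting vertex is passed to `prim`, which is the property used in the text to replace the invasion process by the Kruskal process.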

Thus finding the ground state of the NS model is a 
MST problem. Once the MST with the costs -Kij in the 
interior, and wired boundary conditions, has been found, 
the spin orientations follow using the signs εij and the
boundary spins. Because this is now essentially solving a
spin-glass model on a forest, there is effectively no
frustration left at this stage. We recall that frustration in
an Ising spin model with a Hamiltonian of the form of H
means that there exist cycles of the graph such that the
product of signs of the edges on the cycle is negative;
generally this means that all the details of the magnitudes
of the couplings must be studied in order to find
the ground state. For the EA model in more than 2
dimensions, finding the ground state is computationally
costly. Indeed, in d > 2 the problem of determining
whether the ground state energy (cost) of an Ising spin
glass Hamiltonian H is less than some bound (budget) is
NP-complete 0j, and so presumably cannot be solved
in polynomial time (for a discussion of NP-completeness,
see Ref. [10]). (For d = 2, or for planar graphs, the spin
glass ground state can be mapped to a network flow prob- 
lem, and solved in polynomial time.) By contrast, for the 
NS model, the ground state can be found in polynomial
time using an MST algorithm. 
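The reduction to an MST problem can be illustrated with a short sketch. It is written under stated assumptions rather than taken from NS: the Hamiltonian is assumed to have the form H = -Σ Jij Si Sj, the magnitudes |Jij| are assumed so broadly distributed that each exceeds the sum of all smaller ones, and every vertex is assumed to be connected to the boundary. The function names and the virtual "root" vertex that stands in for the wired boundary are illustrative only.

```python
import itertools
import random


def energy(bonds, spins):
    """H = -sum_ij J_ij S_i S_j for bonds given as (J, i, j) (assumed form)."""
    return -sum(J * spins[i] * spins[j] for J, i, j in bonds)


def ns_ground_state(n, bonds, boundary_spins):
    """Greedy (Kruskal-style) ground state with fixed boundary spins.

    bonds: list of (J, i, j); boundary_spins: dict vertex -> +1 or -1.
    Edges are tested in order of decreasing |J|, i.e. increasing cost -K.
    """
    root = n  # virtual vertex representing the wired boundary, spin +1
    parent = list(range(n + 1))
    sign = [1] * (n + 1)  # sign[v] = S_v * S_parent[v]

    def find(v):
        """Return (root of v, S_v * S_root)."""
        s = 1
        while parent[v] != v:
            s *= sign[v]
            v = parent[v]
        return v, s

    def union(i, j, rel):
        """Impose S_i * S_j = rel; return False if i and j are already connected."""
        ri, si = find(i)
        rj, sj = find(j)
        if ri == rj:
            return False  # relative orientation already fixed: edge rejected
        if ri == root:  # keep the virtual boundary vertex as the root
            ri, si, rj, sj = rj, sj, ri, si
        parent[ri] = rj
        sign[ri] = si * sj * rel
        return True

    for b, s in boundary_spins.items():
        union(b, root, s)  # wiring: boundary spins are fixed relative to root
    for J, i, j in sorted(bonds, key=lambda e: -abs(e[0])):
        union(i, j, 1 if J > 0 else -1)
    return [find(v)[1] for v in range(n)]
```

For small systems the result can be compared with exhaustive enumeration over the interior spins using `energy`; with magnitudes such as distinct powers of 3 (each larger than the sum of all smaller ones) the greedy configuration should attain the minimum.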



E. Ground states of the NS model and fractal 
dimensions 

We now turn to NS's analysis of their model. The 
model was constructed so as to be soluble. The use of 
fixed boundary spins, or the wired boundary condition 
on the MST, was motivated by a deep view of the mean- 
ing of a thermodynamic state in a spin system, or in 
the present case a ground state 0, [^, Isij ] . In an infinite 
system, a ground state can be defined as a spin configu- 
ration the energy of which cannot be lowered by revers- 
ing the values of any finite set of spins. A ground state 
spin configuration in any bounded portion of the lattice 
(such a configuration is simply that of minimum energy) 
is completely determined by the values of the spins on 
its boundaries. As the size of this portion goes to in- 
finity (keeping the bonds Jij the same in the interior as 
fresh bonds and spins are added at the boundary), there 
should exist sequences of boundary conditions such that 
the spin configuration seen in any "window" (subregion of 
the system) converges to a limit. When this is done for a 
sequence of windows diverging in size (such that the spin 
configurations agree where the windows overlap), then a 
ground state of the infinite system is obtained. It is then 
clear that this ground state is determined by a choice 
of boundary conditions infinitely far away. The use of 
arbitrary boundary spins on the boundary of a finite re- 
gion (a hypercube of side L, say) as in NS is then part of 
this process, and approximates the ground states of the 
thermodynamic limit. 

As we have seen, NS determine the ground state of 
their model, using boundary spin values and a set of in- 
vasion trees. Then the spins in the box lie on a spanning 
forest in which each connected component touches the 
boundary just once. All spins on such a connected component
will be reversed if the value of the corresponding
boundary spin is reversed. Thus the logarithm of the
number of possible distinct ground states obtained inside
the box of size L by varying the boundary spin values is
bounded by $O(L^{d-1} \ln 2)$ (it is a bound, not the actual
number, because some boundary spins may not be connected
to any interior spins, and these boundary spins
are not to be considered when counting configurations).
But it is better to consider a window of side W within the
box, with W ≪ L, and ask how many distinct configurations
can be obtained within the window as the boundary
conditions are varied, preferably without counting
configurations that differ only just inside the surface of the
window. The logarithm of this number is given by the
number of connected components of the minimum spanning
forest that intersect the window (neglecting those
whose linear size is less than W/2, say). (There may be
some ambiguity here concerning connected components
that intersect the window more than once, and are connected
outside the window but not inside, as these do not
give 4 ground states, but only 2; however, we will assume
this is not significant.) Thus this number, $\ln \mathcal{N}(W)$, is the
same, up to a factor of ln 2, as N(W) as defined in Sec. I:

$$ \ln \mathcal{N}(W) = N(W) \ln 2. \tag{II.3} $$

We have arrived at the same question about MSTs that
was already introduced in Sec. I: the behavior of the
number of connected components of a MST intersecting
a window, or alternatively the fractal dimension of the
connected component(s) of an infinite MST.

NS studied this question using the Prim algorithm. To 
find whether two vertices, x, y, say, are on the same con- 
nected component, we may grow the invasion tree from 
each of them, and see whether they intersect. More exactly,
we could first grow the tree from x to infinity, then
begin again from y, stopping if $T_\infty(x)$ is encountered.
Since each invasion tree is a fractal of dimension $D_{\rm inv} = 4$
(for d > 6), NS were able to show rigorously that when
$d > 2D_{\rm inv} = 8$, there is a non-zero probability that the
two trees "miss" and never intersect jl, @ . This event im- 
plies that the two points are on distinct connected com- 
ponents, and so if the critical dimension dc is defined as 
that above which the MST has more than one connected 
component in the thermodynamic limit (with probability 
one), then NS proved the upper bound dc ≤ 8.

To determine the actual value of dc, NS stated that 
they need a converse result, in other words a lower bound. 
This converse statement would result if the probabil- 
ity that the two invasion trees do intersect is of order 
1 for d < 8. Such behavior is natural if one has two 
independently-grown fractals of dimension 4 (or more 
generally, if 2D > d for two fractals of dimension D). 
Now for the invasion trees, independence does hold when 
the clusters are sufficiently far apart, because the edges 
considered when growing either one are only those bor- 
dering it, not all those in the system. (We recall that 
the edges bordering the tree are those with one end on 
the tree, and one end off it.) More precisely, they are 
independent as long as they do not intersect and the sets 
of edges bordering one are disjoint from the set of edges 
bordering the other. This leads to NS's result dc < 8. 
However, if the trees do become close together so that
they share one or more bordering edges, for example if
$T_\infty(x)$ is already grown first, and we are growing the
tree from $y \notin T_\infty(x)$, then they are correlated. From the
invasion process, we can see that edges bordering $T_\infty(x)$
must be of cost higher than pc (again, in the model where
edge costs $\ell$ or p are uniform between 0 and 1), and not
arbitrary like those bordering the tree from y when they
are first encountered. This very strong correlation effect
reduces the probability that any such edge that forms the
connection between $T_\infty(x)$ and the tree from y will be
accepted on $T_\infty(y)$. Hence we expect that dc < 8.

More quantitatively, as the tree $T_\infty(x)$ has dimension
$D_{\rm inv} = 4$ for d > 6, the probability G(z - x) that a point
z will be on $T_\infty(x)$ should behave as

$$ G(z - x) \sim |z - x|^{D_{\rm inv} - d} \tag{II.4} $$

for large |z - x| [1, 0, H^. If the two trees were grown
independently, the probability that they intersect would
behave as

$$ \int d^d z \, G(z - x)\, G(z - y) \sim |x - y|^{2D_{\rm inv} - d}. \tag{II.5} $$

For $d < 2D_{\rm inv}(d) = 8$ this is instead of order one. This
would give the converse result of NS. It would also imply
that connected components of the minimum spanning
forest have dimension at least $2D_{\rm inv} = 8$ for d > 8
(though NS seem to have believed that this dimension
would be 4). But because the trees are not independent
when they intersect, neither result is correct.

To make a crude estimate of the correct result, we use
scaling arguments for the growth of $T_\infty(y)$. Given x and
y, the tree from y first approaches close to $T_\infty(x)$ when
its linear size $\xi$ is of order |x - y|. The costs of edges
typically accepted when it reaches a size scale $\xi$ will be
distributed up to about pc, with p - pc not larger than of
order $\xi^{-1/\nu_{\rm perc}}$, where we recall that $\nu_{\rm perc} = 1/2$ for d > 6.
(Here p stands for the cost of the edge, in accordance with
the earlier discussion of the relation of MST with
percolation.) Indeed, the probability density for an edge to be
accepted at this stage will be given by a scaling function
of $(p - p_c)\,\xi^{1/\nu_{\rm perc}}$, which goes to one when its argument
is large and negative, and to zero when its argument is
large and positive. Similarly, the costs of edges bordering
$T_\infty(x)$ at distance of order $\xi$ from x have a probability
distribution that is a function of $(p - p_c)\,\xi^{1/\nu_{\rm perc}}$, which
vanishes for p < pc and goes to one for large positive
arguments. Multiplying these functions and integrating
over p to obtain the probability that the first such edge
is accepted, the result is of order $\xi^{-1/\nu_{\rm perc}} = |x - y|^{-2}$
for d > 6. Assuming that consideration of subsequent
events, which lead to smaller probabilities as they occur
on larger scales, leads to a converging sum of terms, this
then leads to the reduction of the dimensions by 2, so
dc = 6 and D = 6 for d > 6 (where, throughout this
paper, D is the dimension of the connected components
of the MST). In the next section, we obtain the same
results by much more secure methods on the BL, and a
plausible application to Euclidean space also.
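The exponent bookkeeping in this argument can be written out in one line. The following is only a sketch of the arithmetic, under the assumption (made in the text) that later events change prefactors but not powers, and that the probability for two points to lie on the same component of dimension D scales as $|x-y|^{D-d}$, by analogy with Eq. (II.4):

```latex
\begin{align}
\Pr[\,x,\ y \text{ on the same component}\,]
  &\sim |x-y|^{2D_{\rm inv}-d}\;|x-y|^{-1/\nu_{\rm perc}} \\
  &= |x-y|^{8-d-2} \;=\; |x-y|^{6-d} \qquad (d > 6).
\end{align}
```

Comparing with $|x-y|^{D-d}$ gives D = 6 for d > 6, and the probability stops decaying with distance at d = 6, which is the sense in which dc = 6.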

The behavior of the probability that two vertices are 
on the same connected component of the spanning for- 
est is exactly what is needed to count the significantly- 
different ground states of the NS model that are visible in 
a window. The number of large connected components
intersecting the window will scale as $W^{d-D}$, which by
the above scaling argument, and the results below, will
be $W^{d-6}$ for d > 6; thus $\theta = d - 6$ (thus the use of
$\nu_{\rm perc}(d) = 1/2$ in the above scaling argument is justified
self-consistently). We mention here that the authors of
Ref. [2^ , who considered the path exponent Dp but not
D or $\theta$ for MSTs, also stated that the critical dimension
for MSTs is 6, apparently because they were using Prim's 
algorithm, and its connection to invasion percolation, for 
which again dc = 6. 



III. BETHE LATTICE 

In this section we define and solve the MST problem 
on the BL with wired boundary conditions. The BL can 
be motivated by the desire to find a "mean field" theory, 
in which only the mean effect of neighbors of a vertex 
is included, while effects of correlations that propagate 
around cycles of the lattice are neglected. The BL fulfils 
these requirements as it possesses no cycles, but is still 
homogeneous due to the constant coordination number 
(degree). For the MST on the BL, we calculate the ex- 
pectation value of the number of vertices that are con- 
nected to the origin within radius m. We show how to 
define and analyze other correlation functions also. The 
BL solution forms the basis for a mean-field theory de- 
fined on a finite-dimensional lattice in the next section. 
The BL results imply that the mean-field fractal dimen- 
sion of paths on the MST in Euclidean space is 2, and 
the fractal dimension of connected components is 6. This 
establishes that dc = 6 by the argument given above. 



A. Preliminaries 

The BL is a tree of degree $z = \sigma + 1$ at every vertex,
except for the "leaves" at the boundary, which are all
a "distance" (distance means minimum number of steps
along the tree required to go from one vertex to another)
M from the central vertex, which we call the origin. We
use the term BL interchangeably to refer to both a finite
graph of radius M and the graph obtained in the $M \to \infty$
limit. Since the MST on a finite tree graph will clearly be 
the whole tree, we must use wired boundary conditions, 
as discussed in the previous section, in order to obtain 
an interesting spanning forest. We will see that the limit 
of such a forest does exist, at least in terms of its local 
statistical properties, at finite distances from the origin 
(hence, far inside the boundary); this is the "weak limit" 
which has been discussed in Ref. [4l[ . We expect that the 
local properties of this forest in the limit mimic the local 
properties of the MST in Euclidean space in a mean-field 
sense. 

The wired boundary condition means that we connect
all vertices on the boundary of the BL to each other with
additional "boundary edges" of vanishing cost, so that
they are always connected for $p_0 > 0$, as shown in figure 1.
In practical terms, this means that when we run the
Kruskal process, once a connected component (or cluster)
of the forest touches the boundary, no edge that would
connect it to the boundary by a distinct path can be
added later. When the process is run up to $p_0 = 1$, we
obtain a forest spanning the BL, which becomes a tree if
the boundary edges are included.
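The practical meaning of the wired boundary condition can be sketched in a few lines. This is an illustrative implementation under simple assumptions (uniform random edge costs, all leaves merged into one component before any interior edge is tested); the function names `bethe_lattice` and `wired_msf` are hypothetical and not from the original work.

```python
import random


def bethe_lattice(sigma, M):
    """Finite BL of radius M: the origin has sigma + 1 neighbours, others sigma children.

    Returns (edges, depth), where edges are (parent, child) pairs and depth[v]
    is the distance of vertex v from the origin (vertex 0).
    """
    edges, depth = [], [0]
    frontier = [0]
    for d in range(1, M + 1):
        new_frontier = []
        for v in frontier:
            for _ in range(sigma + 1 if v == 0 else sigma):
                child = len(depth)
                depth.append(d)
                edges.append((v, child))
                new_frontier.append(child)
        frontier = new_frontier
    return edges, depth


def wired_msf(sigma, M, rng):
    """Kruskal's algorithm on the BL with all leaves pre-merged (wired boundary)."""
    edges, depth = bethe_lattice(sigma, M)
    n = len(depth)
    parent = list(range(n))

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    leaves = [v for v in range(n) if depth[v] == M]
    for v in leaves[1:]:
        parent[find(v)] = find(leaves[0])  # zero-cost boundary edges
    forest = []
    for cost, (a, b) in sorted((rng.random(), e) for e in edges):
        ra, rb = find(a), find(b)
        if ra != rb:  # reject edges giving a second route to the boundary
            parent[ra] = rb
            forest.append((a, b))
    return forest, depth
```

Because the boundary behaves as a single vertex, the number of accepted interior edges equals the number of interior vertices, and each component of the resulting forest contains exactly one leaf.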

FIG. 1: A sample realization of the MST (solid lines) on the
BL with wired boundary condition. The graph shown has
$\sigma = 2$ and radius M = 5. Vertices on the boundary are
connected with boundary edges of vanishing cost, shown here
as thin lines. All vertices in the interior are connected to the
boundary only once.

We will use the correspondence with bond percolation
discussed in section II. Percolation on the BL has been
studied thoroughly [36, 51]. The most basic question of
interest is to find the probability that a vertex in the
interior is connected to the boundary, as a function of the
probability p for each edge to be (independently) occupied.
A convenient quantity to look at is the probability
$F_M(p)$ that a given vertex is not connected to the boundary
at distance M via a path of edges through a given
outward branch. This obeys the recurrence relation

$$ F_{M+1}(p) = 1 - p + p\,F_M(p)^{\sigma}, \tag{III.1} $$

with $F_0(p) = 0$. This arises because either a) the first
edge along the branch is unoccupied, which occurs with
probability 1 - p, or b) the first edge is occupied (with
probability p) and there is no connection to the boundary
through the $\sigma$ branches further towards the boundary;
these two alternatives are disjoint. As the radius
M of the lattice goes to ∞, all vertices a fixed distance
from the origin are far from the boundary, and the
probability of any of them not being connected to the boundary
along a particular outward branch approaches a limit
$\lim_{M\to\infty} F_M(p) = F(p)$, which is given by a stable fixed
point with $0 \le F \le 1$ of the recurrence, that is

$$ F(p) = 1 - p + p\,F(p)^{\sigma}. \tag{III.2} $$

This has the trivial solution F = 1, and its stability is
given by linearizing the recurrence about this solution;
the eigenvalue of the linearized recurrence for $F_M - 1$ is
$\sigma p$. Thus for $p < 1/\sigma$ the solution F = 1 is stable. In
this regime, no interior vertex is connected to infinity,
with probability one. The value $p_c = 1/\sigma$ is called the
percolation threshold. For $p > p_c$, the solution F = 1
is unstable, and the stable solution is a non-trivial solution
to the fixed-point condition, eq. (III.2). The existence,
uniqueness, and stability of, and the convergence
to, these fixed points is proved in, for example, Ref. [S^l
(in different notation). For $\sigma = 2$, this solution can be
found explicitly,

$$ F(p) = (1 - p)/p \tag{III.3} $$

for p > 1/2 ($\sigma = 2$). For general $\sigma$, we can expand in
powers of $p - p_c$, and find that $1 - F(p) = a(p - p_c) + O((p - p_c)^2)$
as $p \to p_c$ from above, where $a = 2\sigma/(\sigma - 1)$.
(As $p \to 1$, $F(p) \to 0$.) The probability that any vertex is
connected to infinity (and hence is on an infinite spanning
cluster) is then $P_\infty(p) = 1 - F(p)^{\sigma+1}$, which is zero for
$p < p_c$, and turns on with a discontinuous derivative at
$p_c$. This behavior defines the critical exponent $\beta = 1$
via $P_\infty(p) \sim (p - p_c)^{\beta}$ for $p \to p_c$ from above, which
illustrates that percolation is a critical phenomenon with
discontinuous properties at $p_c$.
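The recurrence (III.1) is easy to iterate numerically, which gives a quick check of the fixed point, the threshold, and the slope a. The sketch below assumes nothing beyond Eqs. (III.1)-(III.3); the function names are illustrative.

```python
def not_connected_prob(p, sigma, M):
    """Iterate F_{M+1} = 1 - p + p * F_M**sigma starting from F_0 = 0 (Eq. III.1)."""
    F = 0.0
    for _ in range(M):
        F = 1.0 - p + p * F**sigma
    return F


def p_infinity(p, sigma, M=10_000):
    """P_inf(p) = 1 - F(p)**(sigma + 1), with F approximated by M iterations."""
    return 1.0 - not_connected_prob(p, sigma, M) ** (sigma + 1)
```

For $\sigma = 2$ and p = 0.7 the iteration converges to (1 - p)/p, for p = 0.4 it converges to 1, and just above threshold the ratio $(1 - F)/(p - p_c)$ approaches $a = 2\sigma/(\sigma - 1)$. Convergence slows down near $p_c$, where the linearized eigenvalue approaches one, so more iterations are needed there.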



B. Correlation functions 

In using Kruskal's algorithm to calculate the MST on
the BL, we need to keep track of when different vertices
become connected to the boundary as $p_0$ is raised from
0 to 1. As an example, consider the probability $P_{(0;m)}(p_0)$
that the path on the MST from the origin to the boundary
has formed by time $p_0$, that is when all edges of cost
$< p_0$ have been tested, and passes through a given vertex
a distance m from the origin. (When we discuss such
paths, we always mean a path that does not backtrack
on itself.) We may consider this as done in a finite system,
and we always assume that the limit $M \to \infty$ is
unproblematic. Indeed, we will see that the only ingredient
involving properties all the way out to the boundary
is just the probability $F_M(p)$ from the percolation
problem, which is already known to have the appropriate
limit. Note that $P_{(0;m)}(p_0)$ can also be described as
the probability that on the MSF($p_0$) on the infinite BL
there is a path to infinity passing through two specified
vertices at separation m.

FIG. 2: Construction of a path on the BL starting from the
origin (labeled m), passing through the vertex labeled 0, and
connecting to the boundary, here shown as a shaded line. For
this path to lie on the MST, the edge costs must obey the set
of inequalities given in eq. (III.4). Note that each vertical line
(both solid and dashed) stands for a subtree, not just a single
edge, so this diagram depicts the entire BL.

For later convenience, let us label the vertices so that
m is the origin, and 0 is the vertex at distance m that
the path on the MST passes through. To begin with,
consider m > 0. For every vertex j ($j = 1, \ldots, m - 1$)
along the path from the origin to our chosen vertex, there
is a set of paths on the BL connecting j to the boundary
through any of the $\sigma - 1$ side branches leaving the path
from 0 to m. We define $\ell_{j,\infty}$ as the minimum over this
set of the maximum edge cost encountered along each of
these paths (see Fig. 2). In the model with $\ell$ for each
edge uniformly distributed in [0, 1], $\ell_{j,\infty}$ is the value of p
at which a connection to infinity is first formed along any
of these $\sigma - 1$ side branches from vertex j in the bond
percolation process. Similarly, for j = 0 (j = m), define
$\ell_{0,\infty}$ ($\ell_{m,\infty}$) as the minimum cost at which connection to
infinity is formed along any of the $\sigma$ branches other than
the path to m (0) in the percolation process. Then the
desired probability can be written as

$$ P_{(0;m)}(p) = \Pr\Big[\, \bigwedge_{j=1}^{m} \bigwedge_{i=0}^{j} \big( \ell_{j,\infty} > \ell_{i-1,i} \big) \;\wedge\; \bigwedge_{j=0}^{m} \big( \ell_{j-1,j} < p \big) \Big], \tag{III.4} $$

where $\wedge$ denotes a logical 'and' and we simplify the notation
by defining $\ell_{-1,0} = \ell_{0,\infty}$. Notice that in forming
the minimum cost tree connected to infinity, it is immaterial
whether $\ell_{i-1,i} > \ell_{j,\infty}$ for i > j, provided that
$\ell_{j',\infty} > \ell_{i-1,i}$ for $j' \ge i$. This is the crucial fact in setting
up a recurrence relation for $P_{(0;m)}(p)$. Finally for m = 0,
we must make a special definition. The most natural is
that no direction for the path is specified, and $P_{(0;0)}(p)$
is defined to be $P_{(0;0)}(p) = P_\infty(p) = 1 - F(p)^{\sigma+1}$.

If the number of side branches from vertex m (m > 0)
were the same as those at vertices $j = 1, \ldots, m - 1$, namely
$\sigma - 1$, the probability $P_{(0;m)}(p)$ would obey a recurrence
relation. Let us define $P'_{(0;m)}(p)$ in the identical
way with this modification. (This corresponds to an
alternative definition of the BL that is sometimes used,
in which the origin alone in the tree has degree $\sigma$; it is
used because it simplifies the use of recurrence relations
in a similar way as here.) Then eq. (III.4) can be explicitly
constructed as a recurrence relation, most easily by
working with the derivatives

$$ \Phi_m(p) = \frac{d}{dp} P'_{(0;m)}(p). \tag{III.5} $$

The initial condition for the recurrence is

$$ \Phi_0(p) = \frac{d}{dp}\big( 1 - F^{\sigma}(p) \big). \tag{III.6} $$

The costs $\ell_{j,\infty}$ for $j = 0, \ldots, m$, and $\ell_{j-1,j}$ for $j = 1, \ldots, m$
are all statistically independent (note that $\ell_{m,\infty}$ now
refers to a collection of $\sigma - 1$ branches, like the others).
The probability that $\ell_{j,\infty} < p$ is $1 - F^{\sigma-1}(p)$ for
$j = 1, \ldots, m$. The probability that $\ell_{j-1,j} < p$ is p. The
recurrence is then given by

$$ \begin{aligned} \Phi_j(p) &= F(p)^{\sigma-1} \Big( p\,\Phi_{j-1}(p) + \int_0^p dp'\, \Phi_{j-1}(p') \Big) \\ &= F(p)^{\sigma-1} \frac{d}{dp} \Big[ p \int_0^p dp'\, \Phi_{j-1}(p') \Big]. \end{aligned} \tag{III.7} $$

The factor of $F^{\sigma-1}(p)$ is the probability that the $\sigma - 1$
side branches at vertex j are not connected to infinity
by p. In the bracket in the first line of eq. (III.7), the
two terms correspond to the two cases that the most
expensive edge connecting the vertex j to infinity on the
path, which has cost p, either is not or is the edge
connecting vertices j - 1 and j, respectively. In the first case
the latter edge is already occupied (with probability p),
and connection occurs at p somewhere in the remainder
of path, with probability density described by $\Phi_{j-1}(p)$,
while in the second the edge j - 1, j is the one that
becomes occupied at p, and the probability must be
multiplied by the probability $P'_{(0;j-1)}(p)$ that the rest of the
path is already formed. We write the recurrence in terms
of a kernel K:

$$ \Phi_j(p) = \int_0^1 dp'\, K(p,p')\, \Phi_{j-1}(p'); \tag{III.8} $$

$$ \begin{aligned} K(p,p') &= F(p)^{\sigma-1} \big[ \theta(p - p') + p\,\delta(p - p') \big] \\ &= F(p)^{\sigma-1} \frac{d}{dp} \big[ p\,\theta(p - p') \big], \end{aligned} \tag{III.9} $$

where $\theta(x)$ is the usual step function.

The kernel K(p,p') in the recurrence has the meaning
of a conditional probability density. It is the probability
density (in the p variable) that j is first connected to
infinity along the specified path passing through j - 1 at
value p (and is not connected to infinity along any side
branches), given that j - 1 is first connected to infinity in
the specified manner at value p'. It is clear that this must
vanish if p < p'. We note that if the factors $F^{\sigma-1}$ are
omitted, then we obtain the corresponding conditional
probability relevant to percolation, in which connections
to infinity along side branches are allowed, instead of that
relevant to MSTs. Similar interpretations apply to the
iterates of K that we consider next.

The recurrence relation (III.7), and its generalizations,
allow us to compute correlation functions on the BL.
Iteration of the recurrence requires iterated integrals of K,

$$ K^{*m}(p_1, p_{m+1}) = \int dp_2 \cdots \int dp_m \, K(p_1,p_2) \cdots K(p_m,p_{m+1}). \tag{III.10} $$

The explicit step and $\delta$-functions in each factor of K
guarantee that $p_1 \ge p_2 \ge \cdots \ge p_{m+1}$. From the definition
we then have

$$ P_{(0;m)}(p_0) = \int_0^{p_0} dp\, dp' \, F(p)\, K^{*m}(p,p')\, \Phi_0(p'), \tag{III.11} $$

where the factor F(p) in the final integration restores the
correct number of branches at the origin.

We can obtain multipoint correlation functions by
introducing branching into the path via additional chains
of Ks. For example, the probability that two points at
distances $m_1 > 0$, $m_2 > 0$ from a vertex that we label 0
are both connected to infinity through 0 by time $p = p_0$
is given by (see Fig. 3)

$$ P_{(0;m_1,m_2)}(p_0) = \int_0^{p_0} dp_1\, dp_2\, dp_1'\, dp_2' \; F(p_1) F(p_2)\, K^{*m_1}(p_1,p_1')\, K^{*m_2}(p_2,p_2')\, \Phi_{(0;0,0)}(p_1',p_2'). \tag{III.12} $$

FIG. 3: The set of paths contributing to $P_{(0;m_1,m_2)}(p_0)$,
drawn using the same conventions as figure 2. For $C^{(2)}(m;p_0)$
we sum over all $m_1 > 0$, $m_2 > 0$ such that $m_1 + m_2 = m$.

The initial distribution is

$$ \Phi_{(0;0,0)}(p,p') = \delta(p' - p)\, \frac{d}{dp}\big( 1 - F^{\sigma-1}(p) \big), \tag{III.13} $$

because the connection of 0 to infinity must not use either
of the two side branches leading to $m_1$ and $m_2$. We
then define the two-point correlation function $C^{(2)}(m;p_0)$
as the probability that two vertices at separation m are
connected to each other and to infinity by $p_0$. For m > 0,
it is given by

$$ C^{(2)}(m;p_0) = 2P_{(0;m)}(p_0) + \sum_{m'=1}^{m-1} P_{(0;m',m-m')}(p_0). \tag{III.14} $$

The term $2P_{(0;m)}(p_0)$ covers the cases where the path to
infinity on the tree from one vertex passes through the
other. For m = 0, we define $C^{(2)}(0;p_0) = P_{(0;0)}(p_0) =
P_\infty(p_0)$. This construction of the two-point correlation
function can be immediately extended to multipoint
correlations, giving the probability that, by time $p_0$, several
specified vertices are on the same component of MSF($p_0$)
and connected to infinity. These will be used in section
III E to analyze the geometry of the trees.



C. Analysis of the iterated kernel 



To make further progress, we need to analyze the behavior
of the iterated kernel $K^{*m}$. First, for $P_{(0;m)}(p_0)$
there is a simple result: substituting (III.6) into (III.7)
and using (III.2) shows that $\Phi_1(p) = \Phi_0(p)/\sigma$. That is,
$\Phi_0$ is a right eigenfunction of K with eigenvalue $1/\sigma$.
Then $\Phi_j(p) = \Phi_0(p)/\sigma^j$ for $j = 1, \ldots, m - 1$. Then
finally we obtain

$$ P_{(0;m)}(p_0) = \frac{1 - F(p_0)^{\sigma+1}}{(\sigma + 1)\,\sigma^{m-1}} = \frac{P_\infty(p_0)}{(\sigma + 1)\,\sigma^{m-1}}. \tag{III.15} $$

In particular, if we put $p_0 = 1$ so that all edges have been
tested, then $P_{(0;m)}(1) = \sigma^{-(m-1)}/(\sigma + 1)$. The meaning
of these results should be clear: due to the isotropy of
the BL, the path from the origin to infinity [which exists
at $p_0$ with probability $P_\infty(p_0)$] must pass through one
of the $(\sigma + 1)\sigma^{m-1}$ vertices a distance m away, and all
are equally probable. Thus the path is an isotropic random
walk on the BL (with no backtracking). In section
III F we will interpret this result in Euclidean space as
a random walk with fractal dimension Dp = 2.
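The eigenfunction property can be checked numerically in the case $\sigma = 2$, where F(p) is known in closed form. The sketch below is illustrative only: it assumes the closed form (III.3), writes the recurrence (III.7) as a derivative of a primitive, and evaluates that derivative by a central finite difference. The function names are not from the original.

```python
SIGMA = 2  # branching number; the closed forms below hold only for sigma = 2


def F(p):
    """F(p) for sigma = 2 (Eq. III.3); F = 1 at and below p_c = 1/2."""
    return 1.0 if p <= 0.5 else (1.0 - p) / p


def phi0(p):
    """Phi_0(p) = d/dp (1 - F(p)**2), written out explicitly for sigma = 2."""
    return 0.0 if p <= 0.5 else 2.0 * (1.0 - p) / p**3


def phi1(p, h=1e-6):
    """Phi_1 from Eq. (III.7): F**(sigma-1) * d/dp [ p * int_0^p Phi_0 ]."""

    def primitive(q):
        # q * int_0^q Phi_0 = q * (1 - F(q)**sigma)
        return q * (1.0 - F(q) ** SIGMA)

    derivative = (primitive(p + h) - primitive(p - h)) / (2.0 * h)
    return F(p) ** (SIGMA - 1) * derivative


def path_probability(m, p0):
    """Eq. (III.15): P_(0;m)(p0) = P_inf(p0) / ((sigma + 1) * sigma**(m - 1))."""
    p_inf = 1.0 - F(p0) ** (SIGMA + 1)
    return p_inf / ((SIGMA + 1) * SIGMA ** (m - 1))
```

For any p above 1/2, `phi1(p)` agrees with `phi0(p) / SIGMA` to within the finite-difference error, and multiplying `path_probability(m, p0)` by the number $(\sigma + 1)\sigma^{m-1}$ of vertices at distance m recovers $P_\infty(p_0)$.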

All the right eigenfunctions of K can also be obtained.
We will require them to be integrable, in particular
at p = 0. The eigenvalue equation for $v_\lambda$,
$\int_0^1 dp_2\, K(p_1,p_2)\, v_\lambda(p_2) = \lambda v_\lambda(p_1)$, becomes a differential
equation for $V_\lambda(p) = \int_0^p dp'\, v_\lambda(p')$,

$$ \big( \lambda - p F(p)^{\sigma-1} \big) \frac{d}{dp} V_\lambda(p) = F(p)^{\sigma-1} V_\lambda(p). \tag{III.16} $$

Locally in p, the general solution of the differential
equation (III.16) is

$$ V_\lambda(p) = C \exp \int^{p} dp'\, \frac{F(p')^{\sigma-1}}{\lambda - p' F(p')^{\sigma-1}} \tag{III.17} $$

(C is a constant). From the definition, $V_\lambda$ must obey
the initial condition $V_\lambda(0) = 0$, and be continuous at all
p. Excluding the solution $V_\lambda(p)$ identically zero on [0, 1],
the continuity of $V_\lambda(p)$ implies that $V_\lambda(0) = 0$ is only
possible if $\lambda$ obeys $\lambda = p F(p)^{\sigma-1}$ for some $p = p_\lambda \ge p_c$,
and $V_\lambda(p) = 0$ for $p < p_\lambda$ (the solution for $p_\lambda \ge p_c$ to
$\lambda = p F(p)^{\sigma-1}$ is unique). This means that $\lambda$ lies in the
interval $(0, 1/\sigma]$. As $p_\lambda$ is approached from above, these
solutions have a power law behavior,

$$ V_\lambda(p) \sim (p - p_\lambda)^{\alpha_\lambda + 1} \tag{III.18} $$

as $p \to p_\lambda$ from above. Using the behavior of F(p) near
$p_c$, we find $\alpha_\lambda \sim -2\sigma(p_\lambda - p_c)$ as $p_\lambda \to p_c$ from above.
Also, $\alpha_\lambda \to -1$ as $p_\lambda \to 1$ ($\lambda \to 0$), and $\alpha_\lambda$ always lies
between 0 and -1. Thus the eigenfunctions $v_\lambda$ are
non-negative and integrable for all $\lambda \in (0, 1/\sigma]$. For $\lambda = 1/\sigma$,
$p_\lambda = p_c = 1/\sigma$, and $v_{1/\sigma}(p) \propto \Phi_0(p)$.
To make progress with higher-order correlation functions
such as (III.12), we first introduce the generating
function (a discrete Fourier-Laplace transform), for
complex w,

$$ \tilde{\Phi}_w(p) = \sum_{j=0}^{\infty} w^j\, \Phi_j(p). \tag{III.19} $$

Inserting this into (III.8) and summing over j yields

$$ \int dp' \, \big[ \delta(p - p') - w K(p,p') \big]\, \tilde{\Phi}_w(p') = \Phi_0(p). \tag{III.20} $$

In order to solve this integral equation, we find the
resolvent operator (or Green function) $g_w$ such that

$$ \int dp_2 \, \big[ \delta(p_1 - p_2) - w K(p_1,p_2) \big]\, g_w(p_2,p_3) = \delta(p_1 - p_3). \tag{III.21} $$

The formal solution of (III.21) is simply

$$ g_w(p_1,p_2) = \delta(p_1 - p_2) + \sum_{j=1}^{\infty} w^j K^{*j}(p_1,p_2). \tag{III.22} $$

Because the set of eigenvalues of K is the interval $(0, 1/\sigma]$,
the series cannot be expected to converge if $|w| > \sigma$. This
will be directly confirmed later. From the structure of K,
we expect in general that $g_w$ is zero for $p_1 < p_2$. Because
of the $\delta$-function term in K, $g_w$ contains a $\delta$-function term
as well as a smooth piece. For $|w| < \sigma$, the $\delta$-function
part is clearly $g_w(p_1,p_2) = (1 - w p_1 F(p_1)^{\sigma-1})^{-1} \delta(p_1 - p_2) + \ldots$,
where the omitted parts are ordinary functions.
For w real and in the interval $[\sigma, \infty)$, the coefficient of the
$\delta$-function blows up at $p_1 = p_2 = 1/w$ and at $p_1 = p_2 =
p_{1/w}$, in terms of the values $p_\lambda$ related to the eigenvalues
$\lambda$ of K as above.

Equation (III.21) can be converted into a differential
equation by defining $G_w(p_2,p_3) = \int_0^{p_2} dp_2'\, g_w(p_2',p_3)$;
then

$$ \big( 1 - w p_1 F^{\sigma-1}(p_1) \big) \frac{\partial}{\partial p_1} G_w(p_1,p_2) - w F^{\sigma-1}(p_1)\, G_w(p_1,p_2) = \delta(p_1 - p_2). \tag{III.23} $$

For $p_1 < p_c$ we have $F(p_1) = 1$, and eq. (III.23) is solved
[using the initial condition $G_w(0,p_2) = 0$] by

$$ G_w(p_1,p_2) = \frac{\theta(p_1 - p_2)}{1 - w p_1}, \tag{III.24} $$

which vanishes if $p_2 > p_c$.

The full solution to (III.23), valid for all $p_1$ and $p_2$, is
obtained by solving the equation without the $\delta$-function
term for $p_1 > p_2$ as in eq. (III.16) above (with $\lambda = 1/w$),
and choosing the constant to obtain the correct
discontinuous behavior at $p_1 = p_2$ (because of the $\delta$-function).
Because of the initial condition $G_w(0,p_2) = 0$, we impose
the requirement that $G_w$ must vanish for $p_1 < p_2$ for all
$p_2$. The result is

$$ G_w(p_1,p_2) = \theta(p_1 - p_2)\, \frac{1}{1 - w p_2 F^{\sigma-1}(p_2)} \exp \int_{p_2}^{p_1} dp'\, \frac{w F(p')^{\sigma-1}}{1 - w p' F(p')^{\sigma-1}}. \tag{III.25} $$

Then $g_w$ is obtained as $g_w(p_1,p_2) = \partial G_w(p_1,p_2)/\partial p_1$.
Both $g_w$ and $G_w$ can be seen to be complex analytic
for $w \notin [\sigma, \infty)$. $g_w$ exhibits singular behavior if w is
real and in $[\sigma, \infty)$ (as expected), and nowhere else, which
means that the spectrum of K is precisely the interval
$(0, 1/\sigma]$. For $w \notin [\sigma, \infty)$, there is no solution of the
homogeneous equation obeying the initial condition that
could be added to the solution. These results agree with
the earlier statements for the two regimes $|w| < \sigma$ and
$p_1 < p_c$.

To extract the large-$m$ behavior of $K^{*m}$, we write

$$K^{*m}(p_1,p_2) = \frac{1}{2\pi i}\oint dw\,\frac{g_w(p_1,p_2)}{w^{m+1}}, \tag{III.26}$$

where the contour is a small circle around the origin. Because
$g_w$ is analytic in $w$ except on $[\sigma, \infty)$, we can increase
the radius of the contour until we hit the first singularity
in $w$, which is located on the real axis at the value
$w_c$ defined by $1 - w_c p_2 F^{\sigma-1}(p_2) = 0$. Parameterizing
the contour as $w = w_c e^{-i\tau}$, the most strongly divergent
piece of the integrand at this singularity will determine
the large-$m$ behavior of $K^{*m}$ via



$$\int_{-\pi}^{\pi} d\tau\,\frac{e^{im\tau}}{(i\tau)^z} = \frac{2\pi m^{z-1}}{\Gamma(z)} - i m^{z-1} e^{-i\pi z}\,\Gamma(1-z;-im\pi) + i m^{z-1} e^{i\pi z}\,\Gamma(1-z;im\pi), \tag{III.27}$$

where $\Gamma(z;t_0) = \int_{t_0}^{\infty} dt\, t^{z-1} e^{-t}$ is the incomplete gamma
function. The first term on the right-hand side of (III.27)
comes from the part of the contour coming in from $\tau = 0^- + i\infty$,
encircling the singularity, and returning to $\tau = 0^+ + i\infty$.
If $z$ is not an integer, there will be a branch cut
from $\tau = 0$ to $\tau = +i\infty$, but the same answer is obtained
regardless. The last terms come from the segments of
the contour running from $\tau = -\infty$ to $\tau = -\pi$ and $\tau = \pi$
to $\tau = \infty$; using the asymptotic form of the incomplete
gamma function we see that

$$i m^{z-1}\left(e^{-i\pi z}\,\Gamma(1-z;-im\pi) - e^{i\pi z}\,\Gamma(1-z;im\pi)\right) = \frac{2}{m\pi^z}\sin\left(m\pi + \frac{\pi z}{2}\right) + O(1/m^2), \tag{III.28}$$

and so these terms may be neglected relative to the first
as $m \to \infty$.
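The statement that the strongest singularity controls the large-$m$ behavior can be checked numerically in the simplest case. The sketch below compares the exact Taylor coefficient of $(1 - w/w_c)^{-z}$ at order $w^m$ with the estimate $m^{z-1}/(\Gamma(z)\,w_c^m)$ that follows from the first term of (III.27). The function names and the sample values of $z$, $w_c$ and $m$ are illustrative assumptions and do not come from the text.

```python
import math


def exact_coefficient_log(m, z, w_c):
    """Log of the w^m Taylor coefficient of (1 - w/w_c)^(-z).

    The coefficient is Gamma(m + z) / (Gamma(z) * m!) / w_c**m.
    """
    return (math.lgamma(m + z) - math.lgamma(z) - math.lgamma(m + 1)
            - m * math.log(w_c))


def leading_term_log(m, z, w_c):
    """Log of the large-m estimate m^(z-1) / (Gamma(z) * w_c**m)."""
    return (z - 1.0) * math.log(m) - math.lgamma(z) - m * math.log(w_c)


def coefficient_ratio(m, z, w_c):
    """Exact coefficient divided by the leading estimate (tends to 1)."""
    return math.exp(exact_coefficient_log(m, z, w_c)
                    - leading_term_log(m, z, w_c))


if __name__ == "__main__":
    for m in (10, 100, 1000):
        print(m, coefficient_ratio(m, z=1.5, w_c=2.5))
```

Working with logarithms avoids overflow in the gamma functions and underflow in $w_c^{-m}$; the ratio approaches one with a correction of relative order $1/m$.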

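In order to make progress in closed form we specialize
to $\sigma = 2$. Using the fixed-point equation for $F$,
$F(p) = (1-p)/p$ for $p > p_c = 1/2$ and

$$\exp\int_{p_2}^{p_1} dp'\,\frac{w F(p')^{\sigma-1}}{1 - w p' F(p')^{\sigma-1}} = \left(\frac{p_2^{w}\,\bigl(1 - w(1-p_1)\bigr)}{p_1^{w}\,\bigl(1 - w(1-p_2)\bigr)}\right)^{1/(w-1)}, \tag{III.29}$$

taking $p_1 > p_2 > p_c$. Defining $h_w(p) = 1 - w(1-p)$ for
brevity,

$$g_w(p_1,p_2) = \frac{\partial}{\partial p_1}\left[\frac{\theta(p_1-p_2)}{h_w(p_2)}\left(\frac{p_2^{w}\,h_w(p_1)}{p_1^{w}\,h_w(p_2)}\right)^{1/(w-1)}\right]. \tag{III.30}$$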

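It might appear that (III.30) has an essential singularity
at $w = 1$, but in fact

$$\lim_{w\to1} g_w(p_1,p_2) = \frac{\partial}{\partial p_1}\left[\theta(p_1-p_2)\,\frac{1}{p_1}\,e^{1/p_2 - 1/p_1}\right], \tag{III.31}$$

and the singularity with smallest $|w|$ is located at $w_c = 1/(1-p_2)$;
$g_w$ may be rewritten as

$$g_w(p_1,p_2) = \frac{\partial}{\partial p_1}\left[\frac{\theta(p_1-p_2)}{(w_c - w)^{w/(w-1)}}\left(\frac{p_2^{w}\,h_w(p_1)}{p_1^{w}\,(1-p_2)^{w}}\right)^{1/(w-1)}\right]. \tag{III.32}$$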
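The closed form (III.29) and the finite limit (III.31) are easy to check numerically. The following sketch, for $\sigma = 2$ only, integrates the exponent in (III.25) with Simpson's rule and compares it with the closed form, then evaluates $G_w$ very close to $w = 1$. The helper names and the sample points are assumptions made for illustration.

```python
import math


def h(w, p):
    """Shorthand h_w(p) = 1 - w (1 - p) used in the text."""
    return 1.0 - w * (1.0 - p)


def F(p):
    """sigma = 2: F = 1 below p_c = 1/2 and (1 - p)/p above it."""
    return 1.0 if p <= 0.5 else (1.0 - p) / p


def exponent_numeric(w, p1, p2, n=2000):
    """Simpson's rule for int_{p2}^{p1} dp' w F / (1 - w p' F), sigma = 2."""
    def f(p):
        return w * F(p) / (1.0 - w * p * F(p))
    step = (p1 - p2) / n
    total = f(p2) + f(p1)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(p2 + k * step)
    return total * step / 3.0


def exp_closed_form(w, p1, p2):
    """Right-hand side of (III.29)."""
    ratio = (p2 ** w) * h(w, p1) / ((p1 ** w) * h(w, p2))
    return ratio ** (1.0 / (w - 1.0))


def G(w, p1, p2):
    """G_w(p1, p2) for sigma = 2 and p1, p2 > p_c, from (III.25) and (III.29)."""
    if p1 < p2:
        return 0.0
    return exp_closed_form(w, p1, p2) / h(w, p2)


if __name__ == "__main__":
    p1, p2 = 0.9, 0.6
    print(math.exp(exponent_numeric(1.5, p1, p2)), exp_closed_form(1.5, p1, p2))
    print(G(1.0 + 1e-6, p1, p2), math.exp(1.0 / p2 - 1.0 / p1) / p1)
```

The second printed pair shows that nothing singular happens at $w = 1$: the apparent $1/(w-1)$ in the exponent is compensated because the ratio inside the bracket tends to one at the same rate.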

Because $1/2 < p_2 < 1$, we know $1 < w_c/(w_c - 1) < 2$ and
the factor of $(w_c - w)$ is responsible for all divergences
as $w \to w_c$. To find the leading-order divergence of this
factor, we use

$$(w_c - w)^{-w/(w-1)} = (w_c - w)^{-w_c/(w_c-1)} + O\left((w_c - w)^{-w_c/(w_c-1)+1}\log(w_c - w)\right), \tag{III.33}$$

which, with $w = w_c e^{-i\tau}$, diverges as $(i\tau)^{-1/p_2}$. We may now use (III.27) to evaluate

$$K^{*m}(p_1,p_2) = \frac{1}{2\pi w_c^{m}}\int_{-\pi}^{\pi} d\tau\; e^{im\tau}\, g_{w_c e^{-i\tau}}(p_1,p_2). \tag{III.34}$$



Because of cancellations that take place in (III.30) when $p_1 \to p_2$, we must take the derivative with respect to $p_1$
explicitly before the $m \to \infty$ limit. The result is

$$K^{*m}(p_1,p_2) \sim \delta(p_1-p_2)\,(1-p_2)^m + \theta(p_1-p_2)\,\frac{(1-p_1)(1-p_2)}{p_1\,(p_1-p_2)^2}\left(\frac{p_2\,(p_1-p_2)}{p_1\,(1-p_2)}\right)^{1/p_2}\frac{(1-p_2)^m\, m^{1/p_2-1}}{\Gamma(1/p_2)}. \tag{III.35}$$



D. Asymptotics of the two-point function and mass 

In this section we complete the calculation of the important
correlation function $C^{(2)}$. We present two versions:
an essentially exact version for $\sigma = 2$, and an
asymptotic calculation valid for all $\sigma$ in the region $p_0 - p_c$
small.



If we make no further approximations, when we calculate
correlation functions such as (III.14) it is clearly easier
to integrate over $p_2$ first and then find the large-$m$
behavior rather than attempt to integrate (III.35) directly
over $p_2$. As an example we now calculate the asymptotic
behavior of $C^{(2)}(m;p_0)$. Inserting $\delta_{m,m_1+m_2}$ in (III.14)
gives

$$C^{(2)}(m;p_0) \sim \int_0^{p_0} dp_1\,dp_2 \oint \frac{dw}{2\pi i\, w^{m+1}}\;\Psi_{w,w}(p_1,p_2), \tag{III.36}$$

where

$$\Psi_{w_1,w_2}(p_1,p_2) = \int dp_1'\,dp_2'\; g_{w_1}(p_1,p_1')\, g_{w_2}(p_2,p_2')\, \Phi_{(0;0,0)}(p_1',p_2'), \tag{III.37}$$

with $\Phi_{(0;0,0)}(p_1',p_2') = \theta(p_1' - p_c)\,\delta(p_1' - p_2')/p_1'^2$ for $\sigma = 2$. This integral is done easily:



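$$\Psi_{w,w}(p_1,p_2) = \frac{\partial}{\partial p_1}\frac{\partial}{\partial p_2}\,\frac{1}{w+1}\left(\frac{h_w(p_1)\,h_w(p_2)}{p_1^{w}\,p_2^{w}}\right)^{1/(w-1)}\left[(2-w)^{-(w+1)/(w-1)} - \left(\frac{p}{h_w(p)}\right)^{(w+1)/(w-1)}\right], \tag{III.38}$$

where $p = \min(p_1,p_2)$. Clearly, since the integrand in (III.37) has no support below $p_c$, we obtain a branch cut
starting at $w_c = 1/p_c = 2$ and this is the maximum radius the contour in $w$ may take. Since the only divergence as
$w \to 2$ comes from the first term in the square brackets, we obtain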



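$$\Psi_{w,w}(p_1,p_2) \sim \frac{\partial}{\partial p_1}\frac{\partial}{\partial p_2}\,\frac{h_2(p_1)\,h_2(p_2)}{p_1^2\,p_2^2}\left[\frac{1}{24\,(i\tau)^3} + O\left(\frac{\log\tau}{\tau^2}\right)\right] \tag{III.39}$$

with $w = 2e^{-i\tau}$. Then, using

$$\int_{-\pi}^{\pi} d\tau\,\frac{e^{im\tau}}{(i\tau)^3} = \pi m^2 + O(1/m) \tag{III.40}$$

and

$$\int_{p_c}^{p_0} dp\,\frac{\partial}{\partial p}\,\frac{h_2(p)}{p^2} = \frac{2p_0 - 1}{p_0^2}, \tag{III.41}$$

we finally obtain

$$C^{(2)}(m;p_0) \sim \frac{1}{48}\left(\frac{2p_0-1}{p_0^2}\right)^2 \frac{m^2 + O(m\log m)}{2^m}. \tag{III.42}$$

We see that $C^{(2)}(m;p_0)$ scales as $m^2/\sigma^m$ for large $m$, and
this behavior is obtained for any $p_0 > p_c$.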

Next we present an alternative calculation valid for all
$\sigma$ when $p_0 - p_c$ is small. We will define $\delta p = p - p_c$ (but
e.g. $\delta(p_1 - p_2)$ is a $\delta$-function as usual). We then have

$$p F(p)^{\sigma-1} \simeq p_c - \delta p, \tag{III.43}$$

valid as $p \to p_c$ from above. Then, using (III.25), $G_w$ can
be calculated for $w \notin [\sigma, \infty)$. Because all the singular
behavior as $w \to \sigma$ arises from the denominator of the
integrand of (III.25), we may approximate $F^{\sigma-1}(p) \simeq 1$
in the numerator and

$$G_w(p_1,p_2) = \theta(p_1 - p_2)\,\frac{\delta p_1 + 1/w - p_c}{w\,(\delta p_2 + 1/w - p_c)^2} \tag{III.44}$$



for $\delta p_1$, $\delta p_2$ both small and positive. Then if we consider
the transform

$$C^{(2)}_w(p_0) = \sum_{m=0}^{\infty} w^m\, C^{(2)}(m;p_0), \tag{III.45}$$

we notice that the transform of the sum over $m_1$ in eq.
(III.14) becomes simply a product of $g_w$'s inside the
integral in eq. (III.12), and we neglect the factors $F(p_1)$,
$F(p_2)$ as these do not affect the leading $m$ dependence.
Further we can neglect $2P_{(0;m)}$ as it falls off faster than
the term we keep. Also $\Phi_{(0;0,0)}(p_1,p_2) \simeq 2\sigma\,\delta(p_1 - p_2)$ for
$p_1$ above $p_c$, and zero below. Finally, we will estimate
the expected mass inside radius $m$,

$$M(m,p_0) = C^{(2)}(0;p_0) + \sum_{m'=1}^{m} (\sigma+1)\,\sigma^{m'-1}\, C^{(2)}(m',p_0) \tag{III.46}$$

directly, as this is similar to the definition of the transform
of $C^{(2)}$: we simply evaluate the transform at
$w = \sigma(1 - 1/m)$, which cuts off the sum at around $m$. This
is a value at which $G_w$ is not singular. Thus we have to
calculate



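$$C^{(2)}_w(p_0) = 2\sigma\,(\delta p_0 + 1/w - p_c)^2 \int_0^{\delta p_0} d\delta p'\; \frac{1}{w^2\,(\delta p' + 1/w - p_c)^4}. \tag{III.47}$$

For fixed $p_0$, the dominant contribution comes from the
lower limit, and contains $(1 - w p_c)^{-3}$ times factors that go
to constants as $w \to \sigma = 1/p_c$. Hence the mass behaves
as

$$M_{\rm soft}(m,p_0) \sim \tfrac{2}{3}\,\sigma(\sigma+1)\,\delta p_0^2\, m^3 \tag{III.48}$$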
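The step from (III.47) to (III.48) can be verified directly, because the integral in (III.47) is elementary. The sketch below evaluates (III.47) at $w = \sigma(1 - 1/m)$, multiplies by the factor $(\sigma+1)/\sigma$ mentioned in the text, and compares the result with $\tfrac{2}{3}\sigma(\sigma+1)\delta p_0^2 m^3$. The function names and the sample parameters are illustrative assumptions; the comparison is meaningful only when $m\,\delta p_0 \gg 1$.

```python
def transform_C2(w, sigma, dp0):
    """Right-hand side of (III.47), with the integral done in closed form.

    a = 1/w - p_c with p_c = 1/sigma, and
    int_0^{dp0} dx (x + a)^(-4) = (a^-3 - (dp0 + a)^-3) / 3.
    """
    a = 1.0 / w - 1.0 / sigma
    integral = (a ** -3 - (dp0 + a) ** -3) / 3.0
    return 2.0 * sigma * (dp0 + a) ** 2 * integral / w ** 2


def soft_mass(m, sigma, dp0):
    """Soft cut-off mass: (sigma + 1)/sigma times the transform at w = sigma (1 - 1/m)."""
    w = sigma * (1.0 - 1.0 / m)
    return (sigma + 1.0) / sigma * transform_C2(w, sigma, dp0)


def soft_mass_leading(m, sigma, dp0):
    """Leading large-m form (2/3) sigma (sigma + 1) dp0^2 m^3."""
    return (2.0 / 3.0) * sigma * (sigma + 1.0) * dp0 ** 2 * m ** 3


if __name__ == "__main__":
    for m in (10**3, 10**4, 10**6):
        print(m, soft_mass(m, 3, 0.01) / soft_mass_leading(m, 3, 0.01))
```

The printed ratio approaches one only once $m$ is much larger than $1/\delta p_0$, which is the regime of validity stated for these results.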



as $m \to \infty$ [we inserted a factor $(\sigma+1)/\sigma$ to account
for the number of neighbors $\sigma + 1$ at the first step in
$M(m,p_0)$, as in eq. (III.46)]. This method in fact differs
from the definition above in using a soft cut-off for the
sum over $m'$ instead of a hard one, $m' \le m$. Now that
the form of the summand is known, we can evaluate it
using either form of cutoff. Hence we find that for the
hard cut-off, the result is smaller by a factor 6:

$$M(m,p_0) \sim \tfrac{1}{9}\,\sigma(\sigma+1)\,\delta p_0^2\, m^3. \tag{III.49}$$

Thus the correlation function behaves as

$$C^{(2)}(m,p_0) \sim \tfrac{1}{3}\,(\sigma\,\delta p_0)^2\, m^2\, \sigma^{-m}, \tag{III.50}$$

which agrees with the result for $\sigma = 2$.
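The factor of 6 between the two cut-offs only depends on the summand growing as $m'^2$ (after the factor $\sigma^{m'}$ has cancelled $\sigma^{-m'}$). A minimal sketch, with hypothetical function names, compares the hard sum $\sum_{m' \le m} m'^2$ with the soft sum $\sum_{m'} m'^2 (1 - 1/m)^{m'}$:

```python
def hard_cutoff_sum(m):
    """Sum of m'^2 for 1 <= m' <= m, which grows as m^3 / 3."""
    return sum(k * k for k in range(1, m + 1))


def soft_cutoff_sum(m, tail=60):
    """Sum of m'^2 (1 - 1/m)^m', truncated where the weight is negligible.

    The geometric weight comes from evaluating the transform at
    w = sigma (1 - 1/m); the sum grows as 2 m^3.
    """
    x = 1.0 - 1.0 / m
    return sum(k * k * x ** k for k in range(1, tail * m + 1))


if __name__ == "__main__":
    for m in (10, 100, 1000):
        print(m, soft_cutoff_sum(m) / hard_cutoff_sum(m))
```

The ratio tends to 6 from below, with a correction of relative order $1/m$.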

Equations (III.49) and (III.50) are the main results of
this section. From the structure of the expression for
$G_w$, the lower limit always dominates, so this $m$ dependence
holds for all $p_0 > p_c$, at sufficiently large $m$. More
precisely, the results are valid only if $\delta p_0$ is greater than
order $1/m$. This means that $m$ is much larger than the
correlation length at $p_0$, which is proportional to $1/\delta p_0$.
The main contribution to the integral is from $\delta p'$ less than
of order $1/m$. This is in agreement with the "superhighways"
idea [23, 24, 25].

Notice also that the factor $\delta p_0^2$ is present in both results
because the probability that one vertex is connected
to infinity is $P_\infty(p_0) \simeq 2\sigma(\sigma+1)\,\delta p_0/(\sigma-1)$, for $\delta p_0$
small and positive, and so is $\propto \delta p_0^2$ for two vertices (the
two events are uncorrelated, because the two points are
separated by more than the correlation length). The
dependence on $\sigma$ is the same within subleading terms of
relative order $1/\sigma$.



E. General kth-order correlation functions and
moments of the mass



We may extend the method to estimate asymptotics
of correlation functions of any order $k$, that is the
probability $C(i_1, \ldots, i_k; p_0)$ that some given set of vertices
(labeled $i_1, i_2, \ldots, i_k$) are on the same connected
component of MSF($p_0$) and that this component is infinite.
The procedure is straightforward: to compute the $k$-point
correlation function for a given set of $k$ distinct vertices,
one draws the smallest subtree of the BL such that all
$k$ given vertices are connected, and chooses any vertex
on this subtree as the root point, along which the
connection to infinity occurs in the MSF($p_0$) (eventually, we
will sum over the possible root points). Thus the leaves
of the subtree (i.e. the degree 1 vertices) must be among
the given $k$ vertices, but if any of the $k$ given vertices are
not leaves, they can be anywhere on the subtree. Starting
from the root, we propagate out to (or possibly through)
each of the $k$ given vertices, along the subtree. The
subtree can be viewed as made of chains of edges connected
by degree-2 vertices, with the ends of the chains at either
(i) the root point, which has degree $\ge 1$, (ii) the leaves of
the subtree, or (iii) vertices of degree $> 2$ other than the
root point. For each such chain $e$ of $m_e$ steps, we
associate the iterated kernel $K^{*m_e}(p_i,p_j)$, where the labels $i$,
$j$ are associated to the two ends of the chain, with $i$ the
end further from the root point. For the initial distribution
at the root, if there are $n$ chains leaving it ($n \le \sigma$),
we generalize $\Phi_{(0;0,0)}$ to

$$\Phi_{(0;0,\ldots,0)}(p_1, \ldots, p_n) = \frac{\partial}{\partial p_1}\left[1 - F^{\sigma+1-n}(p_1)\right]\prod_{j=2}^{n}\delta(p_1 - p_j). \tag{III.51}$$

J=2 



Similarly, by comparing $dP_\infty(p_0)/dp_0$ with $\Phi_0(p')$ and
$\Phi_{(0;0,0)}(p_1,p_2)$, we see that at a vertex of the subtree of
degree $n \neq 2$, we must associate with it a factor

$$V_n(p_1,p_2,\ldots,p_n) = F^{2-n}(p_1)\prod_{j=2}^{n}\delta(p_1 - p_j). \tag{III.52}$$

After multiplying together all these factors, we must integrate
over all the parameters like $p_i$ between the limits 0
and $p_0$. There are two of these parameters for each chain
on the subtree; clearly some could be eliminated using the
$\delta$-functions. Finally, we must sum over all possible root
points on the tree. This procedure yields $C(i_1, \ldots, i_k; p_0)$
(unlike our earlier description for the $k = 2$ case, there
are no exceptions to this prescription for cases of vertices
coinciding with each other or with the root point).

The higher-point correlation functions can be used to
calculate higher moments of the mass, $\overline{M(m,p_0)^k}$. These
are the average of the $k$th power of the sum over positions
at distance less than $m$ from the origin of the "indicator
function" that is one if and only if the vertex is on
the same connected component of MSF($p_0$) as the origin.
$\overline{M(m,p_0)^k}$ is equal to the sum of the $(k+1)$-point
correlation function $C(i_1, \ldots, i_{k+1}; p_0)$ over all positions of $i_2$,
$\ldots$, $i_{k+1}$ within $m$ steps of the origin at $i_1$.

For $m$ large, the largest contribution to $\overline{M(m,p_0)^k}$
will come from configurations of $i_l$ ($l = 1, \ldots, k+1$)
for which, in the subtree in the calculation of
$C(i_1, \ldots, i_{k+1}; p_0)$, all the given vertices are at its leaves,
the root point has degree 2, and the vertices of degree $> 2$
have degree 3. For these there are $2k$ chains (iterated kernels)
in the subtree. We can estimate the power of $m$ as
$m \to \infty$ using the same approximations as for $k = 1$.
The sums over position are estimated by using the propagator
$g_w$ in place of all $K^{*m_e}$'s, with $w = \sigma(1 - 1/m)$
in each one. The factors at the vertices of the subtree
(other than $\delta$-functions) can be dropped, at least when
$\delta p_0$ is small (and for larger $\delta p_0$ do not affect the scaling
behavior). The integrals over the $p_i$ associated with the
leaves can be done by using $G_w$ in place of $g_w$ for these
chains. The remaining integrals over $p_i$'s associated with
the other vertices and the root point are dominated by
the lower limit $\delta p_i \to 0$, and can be estimated by power
counting. As each additional leaf on the subtree leads to
an extra factor $G_w g_w$ and one additional integral similar
to that at the root (as for $k = 1$ above), this yields finally
(neglecting constant factors)

$$\overline{M(m,p_0)^k} \sim m^{3k} \sim \left[\overline{M(m,p_0)}\right]^k. \tag{III.53}$$

As all $k$th moments scale like the $k$th power of the first
moment, this means that the (random) mass $M(m,p_0)$
does not have a very broad distribution, and its typical
behavior is well-described by its expected value. Hence
the connected components of MSF($p_0$) on the BL are not
multifractals.
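The count of $2k$ chains for the dominant configurations can be confirmed with a small combinatorial sketch. The code below grows a subtree whose root has degree 2 and whose other branching vertices have degree 3, by repeatedly splitting a randomly chosen chain and hanging a new leaf from the new branching vertex. The growth rule, the vertex labels and the function names are illustrative assumptions; only the counting matters.

```python
import random
from collections import Counter


def grow_dominant_subtree(k, seed=0):
    """Return the chains (as vertex pairs) of a subtree with k + 1 leaves.

    Start from a root with two chains ending on leaves (the k = 1 case),
    then add one leaf at a time by inserting a degree-3 vertex in the
    middle of a random chain. Each added leaf creates two extra chains.
    """
    rng = random.Random(seed)
    chains = [("root", "leaf0"), ("root", "leaf1")]
    for step in range(1, k):
        a, b = chains.pop(rng.randrange(len(chains)))
        branch = f"branch{step}"
        chains += [(a, branch), (branch, b), (branch, f"leaf{step + 1}")]
    return chains


def degrees(chains):
    """Number of chains meeting at each vertex of the subtree."""
    deg = Counter()
    for a, b in chains:
        deg[a] += 1
        deg[b] += 1
    return deg


if __name__ == "__main__":
    for k in range(1, 6):
        chains = grow_dominant_subtree(k)
        print(k, len(chains), dict(degrees(chains)))
```

Since each extra leaf contributes the same power of $m$ as the first one, the chain count $2k$ is what underlies the power $m^{3k}$ in (III.53).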
F. Fractal dimensions

So far we have developed a method for computing cor- 
relation functions on the BL, while our real interest is 
in lattices in Euclidean space of dimension d. For suffi- 
ciently high d, we would expect that a mean field the- 
ory holds for quantities such as exponents; this assertion 
will have to be justified post hoc in a subsequent paper, 
by a perturbation analysis of corrections due to fluctua- 
tions neglected in the mean field theory. The BL results 
provide the mean-field theory results, once we have ex- 
plained how to convert them to apply to Euclidean space. 

For a hypercubic lattice on Euclidean space, if we
choose a path (starting from the origin) randomly (with
equal probability for each), then it behaves as a random
walk, and after $m$ steps will be of order $\sqrt{m}$ in Euclidean
distance from the origin. In the set of all paths from the
origin, any two paths initially coincide but ultimately
part company. If we neglect the possibility that they
subsequently intersect (and also that a path may intersect
itself, including by backtracking), then the union of
the paths forms a tree, equivalent to the Bethe lattice
with $z \simeq 2d$. Hence in this correspondence, separations
on the BL behave like distances squared on the Euclidean
lattice [51],

$$m \sim R^2, \tag{III.54}$$

which allows us to infer mean-field scaling dimensions
from the BL theory. More formally, to apply the Bethe
lattice results as an approximation for the lattice, in
equation (III.7) we must now sum over all neighbors of
the given site, since the path may go through the site in
any direction. Equation (III.19) needs to be replaced by
a Fourier transform

$$\Phi_{\mathbf{k}}(p) = \sum_{\mathbf{x}} e^{i\mathbf{k}\cdot\mathbf{x}}\,\Phi_{\mathbf{x}}(p), \tag{III.55}$$

which means that in (III.20) and all subsequent equations
we make the substitution
w^Yl e"'*""' = 2d - + 0(|k|4), (IIL56) 
no- 
where $\{\hat{\mathbf{n}}_j\}$ are the basis vectors of the hypercubic lattice.
This leads to the same relation (III.54).
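Both ingredients of this correspondence can be illustrated numerically. The sketch below estimates the mean squared displacement of a simple random walk on the hypercubic lattice, which grows linearly in the number of steps $m$, and checks the small-$|\mathbf{k}|$ expansion $2d - |\mathbf{k}|^2 + O(|\mathbf{k}|^4)$ of the sum of nearest-neighbor phase factors. The walk used here allows backtracking and self-intersection, which is an assumption of the sketch; the function names and sample sizes are also illustrative.

```python
import math
import random


def mean_squared_displacement(d, m, walkers=1000, seed=1):
    """Average of R^2 after m nearest-neighbor steps on the d-dimensional lattice."""
    rng = random.Random(seed)
    total = 0
    for _ in range(walkers):
        pos = [0] * d
        for _ in range(m):
            axis = rng.randrange(d)
            pos[axis] += rng.choice((-1, 1))
        total += sum(x * x for x in pos)
    return total / walkers


def neighbor_phase_sum(k):
    """Sum of exp(i k.n) over the 2d nearest neighbors n, i.e. 2 sum_j cos k_j."""
    return sum(2.0 * math.cos(kj) for kj in k)


if __name__ == "__main__":
    for m in (50, 100, 200):
        print(m, mean_squared_displacement(3, m) / m)
    k = (0.01, 0.02, 0.03)
    print(neighbor_phase_sum(k), 2 * len(k) - sum(kj * kj for kj in k))
```

The first loop prints values close to one, consistent with $m \sim R^2$.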

We established that the probability the path from any
vertex to infinity passes through a given vertex a distance
$m$ away behaves as $P_{(0;m)}(p_0) \sim \sigma^{-m}$; summing this over
all sites within a ball of radius $m$ on the BL gives

$$P_{(0;0)}(p_0) + \sum_{m'=1}^{m} (\sigma+1)\,\sigma^{m'-1}\, P_{(0;m')}(p_0) \sim m. \tag{III.57}$$

Then we expect that in Euclidean space, the mass of
the path lying within radius $R$ scales as $M_p(R,p_0) \sim R^2$,
consistent with the picture of this path (mentioned earlier
for the BL) as a random walk, with dimension $D_p = 2$.
This is the same as the dimension of the backbone of a
critical percolation cluster for $d > d_c = 6$ [36].

Similarly, the mass of the component connected to the
origin on the BL, $M(m,p_0) \sim m^3$, becomes $M(R,p_0) \sim R^6$
within radius $R$ in Euclidean space, meaning that a
connected component of the MST has a mean-field fractal
dimension $D = 6$. Because the union of the spanning
trees fills the lattice, this strongly suggests that the
critical dimension of the MST will be $d_c = 6$, as discussed
in Section II E. This is the same critical dimension as for
percolation (at threshold). We emphasize again that the
result is valid for $\delta p_0$ greater than of order $1/m \sim 1/R^2$.
Since the correlation length $\xi$ behaves as $|p_0 - p_c|^{-\nu_{\rm perc}}$
with $\nu_{\rm perc} = 1/2$ for $d > 6$, this means it holds for $R > \xi$,
and thus involves distance scales at which ordinary
correlations for percolation decay exponentially.

G. Poisson-weighted infinite tree

Another mean-field model for MST (and other ran- 
dom optimization problems), which mimics the contin- 
uum model in Euclidean space, is based on the Poisson- 
weighted infinite tree (PWIT) (s^l- This is a tree with 
infinite degree at each vertex, and the weights (or costs)
$\ell$ for the edges emanating from each vertex are given by a
Poisson process on $\ell > 0$ with density $\rho(\ell) \propto \ell^{d-1}$. This
is the same as the measure on the set of distances between
the points of the uniform Poisson process on $\mathbb{R}^d$. Thus
the model is obtained by taking the continuum model,
and ignoring all correlations induced by the geometry of
space, keeping only the probability distribution for
separations of points.

The PWIT can be viewed as the limit $\sigma \to \infty$ of the
iid BL model with degree $\sigma + 1$ we have used above, with
a different probability distribution on the edges (as we
have seen, the choice of this does not affect the geometry
of the MST). For $\sigma + 1$ neighbors, we can cut off the
distribution with density $\rho(\ell)$ at $\ell \simeq O((\sigma+1)^{1/d})$, then
study the finite-degree trees in the limit. For each $d \ge 1$
in this model, the percolation threshold $\ell_c$ is non-zero,
and because the behavior is dominated by edges close to
threshold, the results for the PWIT will be in the same
universality class as the BL model above. Indeed, if we
set $\ell = (\sigma+1)p$, the BL model as $\sigma \to \infty$ coincides
with the $d = 1$ PWIT. Notice that the limit of our
expressions exists, because by writing $F(p) = 1 - G(\ell)/\sigma$,
then $F^{\sigma-1} \to e^{-G}$, and the fixed-point equation for $F$
becomes [13]

$$G(\ell) = \ell\left(1 - e^{-G(\ell)}\right). \tag{III.58}$$
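The limit leading to (III.58) can be checked by solving both fixed-point equations numerically. The sketch below iterates (III.58) from above to reach its largest solution, and iterates the Bethe-lattice equation $F = 1 - p + pF^{\sigma}$ (the single-degree case of (III.59), taken here as an assumption about the earlier definition) from $F = 0$ to reach its smallest solution; it then compares $\sigma(1 - F)$ at $p = \ell/(\sigma+1)$ with $G(\ell)$. Function names and iteration counts are illustrative choices.

```python
import math


def G_pwit(ell, iterations=500):
    """Largest solution of G = ell (1 - exp(-G)), by iterating downward from G = ell."""
    G = ell
    for _ in range(iterations):
        G = ell * (1.0 - math.exp(-G))
    return G


def F_bethe(p, sigma, iterations=500):
    """Smallest solution of F = 1 - p + p F^sigma, by iterating upward from F = 0."""
    F = 0.0
    for _ in range(iterations):
        F = 1.0 - p + p * F ** sigma
    return F


if __name__ == "__main__":
    ell = 2.0
    print("PWIT:", G_pwit(ell))
    for sigma in (10, 100, 10000):
        p = ell / (sigma + 1)
        print(sigma, sigma * (1.0 - F_bethe(p, sigma)))
```

For $\ell \le \ell_c = 1$ the iteration for $G$ collapses to zero, which corresponds to $F = 1$ below the percolation threshold. Very close to threshold the iterations converge slowly and would need more steps than used here.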

In the transforms, we should also set $s = w/\sigma$. Then
the limit of our theory makes sense for the masses $M$,
$M_p$ within $m$ steps of the origin: $M \sim \tfrac{1}{9}\,\delta\ell^2 m^3$ (again,
$\delta\ell = \ell - \ell_c$, where here $\ell_c = 1$). For the PWIT one
would naturally wish to express such quantities in terms
of the distance $\ell$ defined as the sum of the $\ell_e$'s along
a path on the tree, and since most edges accepted are
either below, or not far above, $\ell_c$, these masses $M(\ell)$,
$M_p(\ell)$ scale the same way, and again $\ell \sim R^2$ because the
paths are random walks. When this measure of distance
is used, the number of connected components inside $\ell$ is
also well-behaved. Note that because it is believed that
continuum percolation is in the same universality class as
bond percolation, we would expect universal properties
of the continuum MST to be the same as those of the
lattice model anyway, so that all of these conclusions are
consistent.



H. Random, locally treelike graphs 

The results of the previous sections admit a simple
generalization to certain classes of random graphs. We
consider a tree where the coordination number of each
vertex is an iid random variable distributed according to
$\rho(\sigma)$.

In what follows, averages with respect to $\rho$ are denoted
by angle brackets. The fixed-point equation for $F(p)$
becomes

$$F(p) = 1 - p + p\sum_{\sigma=0}^{\infty} \rho(\sigma)\,F(p)^{\sigma}. \tag{III.59}$$

Again, $F(p)$ is defined as the smallest solution at fixed
$p$. In what follows we assume $1 < \langle\sigma\rangle < \infty$, which are
the conditions necessary for the graph to admit a
conventional percolation transition: $F(p) = 1$ for $p < p_c = 1/\langle\sigma\rangle$,
$F(p) < 1$ for $p > p_c$, and we can construct a
nontrivial ensemble of MSTs under wired boundary
conditions at infinity.
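The threshold $p_c = 1/\langle\sigma\rangle$ can be seen directly by solving (III.59) numerically. In the sketch below the distribution $\rho$ is represented as a dictionary from $\sigma$ to its probability, and the smallest solution is reached by iterating upward from $F = 0$. The representation, the function names and the sample distribution are assumptions made for illustration.

```python
def F_random_degree(p, rho, iterations=2000):
    """Smallest solution of F = 1 - p + p * sum_sigma rho(sigma) F^sigma.

    rho maps each value of sigma to its probability. Starting from F = 0,
    the iteration increases monotonically to the smallest fixed point.
    """
    F = 0.0
    for _ in range(iterations):
        F = 1.0 - p + p * sum(prob * F ** s for s, prob in rho.items())
    return F


def critical_p(rho):
    """Percolation threshold p_c = 1 / <sigma>."""
    return 1.0 / sum(s * prob for s, prob in rho.items())


if __name__ == "__main__":
    rho = {1: 0.25, 2: 0.5, 3: 0.25}   # hypothetical distribution with <sigma> = 2
    print("p_c =", critical_p(rho))
    for p in (0.3, 0.45, 0.6, 0.8):
        print(p, F_random_degree(p, rho))
```

Below $p_c$ the iteration converges to $F = 1$; above it, to a value strictly less than one. Close to $p_c$ the convergence becomes slow, so more iterations would be needed there.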

At this point we find it helpful to multiply the kernel
defined in (III.9) by a factor of $\sigma$. This has the
effect of summing the connectedness functions over lattice
sites (done in, e.g., (III.46)) simultaneously with averaging
over edge costs. This modification allows us to relate
the spectrum and eigenfunctions of the random graph
kernel $\langle\sigma K\rangle$ back to those obtained above: e.g., $\langle\Phi_0(p)\rangle$
is an eigenfunction of $\langle\sigma K\rangle$ but not of $\langle K\rangle$.

The entire calculation goes through as before, with
occurrences of $F(p)^{\sigma-1}$ in the BL Green's function replaced
by $\langle\sigma F(p)^{\sigma-1}\rangle$ and the transform variable $w$ rescaled by
$1/\langle\sigma\rangle$. Under the additional assumption $\langle\sigma^2\rangle < \infty$, one
may repeat the asymptotic analysis of section III D and
obtain

(Msoft(m,po)) - + l))^plm\ (III-60) 

The generalization extends to the higher moments of the
cluster mass discussed in section III E.



One may also consider "quenched" moments of cluster 
masses, of the form 

MkAm,Po) = l^(M{m,pof)^'^ (III.61) 

for $\ell > 1$. These quantities cannot readily be calculated
with the techniques discussed above and are beyond the 
scope of this paper. 



IV. CONCLUSION 

In this paper, we have achieved the following results.
For a finite graph, we defined a process MSF($p_0$) which
is a random forest that becomes the MST for $p_0 = 1$.
Using the Bethe lattice (BL) with wired boundary
conditions, and taking the infinite size limit, we showed that
the infinite connected components of MSF($p_0$) on the BL
contain of order $m^3$ vertices within $m$ steps of any vertex
on this component, for any $p_0$ greater than the value $p_c$ of
the threshold for bond percolation on the BL. This result
is essentially rigorous. Transferring it (heuristically) to
Euclidean space, this means that the mass of an infinite
connected component of MSF($p_0$) within a ball of radius
$R$ scales as $R^D$ with $D = 6$, for $d$ sufficiently large and
$p_0$ greater than the value $p_c$ of the threshold for bond
percolation on the lattice used. This then implies that
for $d > 6$ there are of order $R^{d-6}$ large connected
components that intersect such a ball. We also gave a
non-rigorous second argument for these results, using scaling
ideas (this argument directly addresses the critical
dimension $d_c$ above which the results hold). The results
also hold (rigorously) for the Poisson-weighted infinite
tree, and (heuristically) for the continuum MST model
in Euclidean space.
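For readers who want to experiment with the finite-graph process, the following is a minimal sketch of Kruskal's greedy algorithm stopped at a threshold. It assumes that MSF($p_0$) on a finite graph is the forest that Kruskal's algorithm has built once all edges of cost at most $p_0$ have been processed, with costs drawn uniformly from $[0,1)$; the precise definition is given earlier in the paper and may differ in detail (in particular, wired boundary conditions are not modeled here). All names are hypothetical.

```python
import random


def msf(n, weighted_edges, p0):
    """Forest built by Kruskal's algorithm using only edges of cost <= p0.

    weighted_edges is a list of (cost, u, v) with vertices 0..n-1.
    For p0 at least the largest cost on a connected graph, the result
    is the minimum spanning tree.
    """
    parent = list(range(n))

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    forest = []
    for cost, u, v in sorted(weighted_edges):
        if cost > p0:
            break
        ru, rv = find(u), find(v)
        if ru != rv:          # accept the edge only if it joins two components
            parent[ru] = rv
            forest.append((cost, u, v))
    return forest


def random_complete_graph(n, seed=0):
    """Complete graph on n vertices with iid uniform edge costs (an assumption)."""
    rng = random.Random(seed)
    return [(rng.random(), u, v) for u in range(n) for v in range(u + 1, n)]


if __name__ == "__main__":
    edges = random_complete_graph(8)
    for p0 in (0.1, 0.3, 1.0):
        print(p0, len(msf(8, edges, p0)))
```

Because the edges are processed in order of increasing cost, the forest at a smaller $p_0$ is always a subset of the forest at a larger $p_0$, which is the monotonicity that links the process to bond percolation.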

Following the reasoning of NS [1, 0] , these results for 
the MST imply that the strongly-disordered spin-glass
model has an uncountable number of ground states for
$d > 6$, of which of order $2^{R^{d-6}}$ can be distinguished
within a ball of radius $R$. For $d < 6$, the logarithm of
the number of ground states is smaller than any power
of $R$, and possibly only of order one, or simply one (with
probability one).